Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for 2-dimensional coord in add_cyclic_point. #1753

Merged
merged 1 commit into from
Nov 5, 2021
Merged

Added support for 2-dimensional coord in add_cyclic_point. #1753

merged 1 commit into from
Nov 5, 2021

Conversation

mcuntz
Copy link
Contributor

@mcuntz mcuntz commented Mar 22, 2021

Added support for 2-dimensional coord in add_cyclic_point.
Optional adding point on 2-dimensional 'latitudes' as well, called rowcoord.
Routine checks that cyclic point is not yet present if coord is given.

Rationale

I am developing a GUI for viewing netCDF files, with similarities to ncview and Panoply, that is using cartopy as its backend: ncvue.
If the tool finds 2-dimensional variables for latitudes and longitudes, it uses the 2-dimensional variables rather than any 1-dimensional variables that might be present. Like this a lot of irregular data can be visualised. For example, I was working with data from different sources that were produced on different rotated grids cuizinart. However, they provide 2-dimensional real latitudes and longitudes. So I could visualise almost all that data. There was only one problem with a Yin-Yang grid.
But of course I had to add cyclic points to data, longitudes and latitudes. So I included it in the add_cyclic_point routine.
I added appropriate tests.

Implications

There are hopefully no implications for the 1-dimensional functionalities. It should basically add new features.

However, I changed the way the cyclic point is set for the coord variables. While before the cyclic point was the last longitude plus the difference to the longitude before, I simply add 360 to the first longitude.
The first method did not work so well with the rotated grids were the 2-dimensional longitudes were computed with Proj4 and they are not equal-distant.

There is one use case where this does not work but neither did the former method: Proj4 computed very small first longitudes of the order of 1e-5 (instead of 0). If the dimension is of type float32 one has about 6 significant digits and adding 360 to 1e-5 gives 360 and not 360.00001. So cartopy did not identify 360 as a cyclic point because it was actually smaller than (to the west of) the first latitude.

Kind regards,
Matthias

lib/cartopy/util.py Outdated Show resolved Hide resolved
lib/cartopy/util.py Outdated Show resolved Hide resolved
lib/cartopy/util.py Outdated Show resolved Hide resolved
lib/cartopy/util.py Outdated Show resolved Hide resolved
@SciTools-assistant SciTools-assistant added the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Mar 22, 2021
@mcuntz
Copy link
Contributor Author

mcuntz commented Mar 22, 2021

Removed multiple spaces before operator E221 (error from @stickler-ci).
Removed one space from doctest output, which made the checks fail. I did not have this error probably because I had used doctest.NORMALIZE_WHITESPACE.

@mcuntz
Copy link
Contributor Author

mcuntz commented Apr 5, 2021

I did sign the SciTools Contributor License Agreement (CLA) on 21.03.2021. I probably did that after the first PR. I could attach the PDF if needed.

@mcuntz
Copy link
Contributor Author

mcuntz commented May 1, 2021

I have signed the Contributor Licence Agreement (v4) document again.
Matthias

@greglucas greglucas removed the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label May 9, 2021
@greglucas
Copy link
Contributor

I'm a little bit confused by the rowcoord option you put in here. Is what you're trying to do is also repeat the latitude coordinates, so that all lon, lat, data have the same dimensions on output? It almost seems like this function would be better served as a cyclic(X, Y, [Z]) ordering instead to make it a little more clear.

What if someone is wanting a cyclic point with something other than degrees? You've added some explicit hard-coded values like 360 in there that I don't think should be required.

I don't think the np.ma's are required here either? They should be dispatched to the proper function if you pass in a MaskedArray to the np.sin(x) function.

@mcuntz
Copy link
Contributor Author

mcuntz commented May 11, 2021

Dear Greg,

thanks for responding. You are absolutely correct that rowcoord is for the latitudes. It is only considered if coord is 2-dimensional.

I tried to follow as much as possible the original routine. It is 100% backwards compatible, except for the calculation of the cyclic point itself (see below).
The longitudes were called coord. So I called the latitudes rowcoord.
coord was also just an optional argument. One could, of course, call it as add_cyclic_point(Z[, X]). And so you can call extended version as add_cyclic_point(Z[, X[, Y]]).
The return was Z[, X] and so the extended return is also Z[, X[, Y]].
The original routine used masked array functions to deal with data and coord and did not rely on dispatching by numpy. So the extended routine follows this explicit approach. I just use import numpy as np and then np.ma.function() instead of the the original use of import numpy.ma as ma and then ma.function(). But I see that the import as ma statement still hangs around in the routine and could be deleted. I can do that if you want.

I understand that in case of 1-dimensional coordinates, it is o.k. to extend only longitudes, because you do a np.meshgrid afterwards with the latitudes. But if the coordinates are already 2-d, then the latitudes have to have the new dimension as well, so rowcoord is needed, because otherwise pcolor[mesh] or contourf won't work if X and Y have different shapes.

All that is pretty much in the spirit of the original function and 100% backward compatible. I only changed the way the cyclic point is calculated. You ask what happens if "someone is wanting a cyclic point with something other than degrees". What should that be? add_cyclic_point is for global grids. So there it should be degrees otherwise the plotting would not work, isn't it? I actually tried just repeating the first column but this did not work; I do not know why. The cyclic point has to be at least +360 to the first point.
The old way did not work for me because it assumed really regular grids. But cartopy works actually with a lot of non-regular grids if you give it the correct 2-dimensional coordinates.
I also added a check if there is already a cyclic point. This also assumes degrees.

My code is actually within my project ncvue, a GUI using cartopy for mapping. I tried quite some different output, e.g. from caspar and cuizinart. Quite some models are not on regular grids. But cartopy is still working once you have the 2-d coords. It works great with rotated grids, for example, and one only needs the 2-d coords. But these are definitely not regular anymore and so the old version by adding the distance between the last two columns does not work. +360 works, though.
The only thing that did not work is the case where the longitudes calculated from a rotated grid are close to but not exactly zero, for example 1e-5, and the variable is float32. Then 1e-5+360. gives 360 exactly. But plotting is not working in this case because somehow it realises that 360. is not at least 360 more than the first longitude. I had a bit of code in ncvue where I then pass to float64 for the longitudes:

if np.ma.allclose(self.ixxc[:, 0], self.ixxc[:, -1], atol=1.0e-4):
self.ixxc = self.ixxc.astype(float)
self.ixxc[:, -1] = self.ixxc[:, 0] + 360.

where self.ixxc are the cyclic longitudes. But this code is commented at the moment.

Kind regards,
Matthias

@greglucas
Copy link
Contributor

I like your idea of writing it as add_cyclic_point(Z [, X, ...]). Maybe you could use a *args somehow...? Although there is already a named kwarg that may cause trouble. We have a similar call here:

def vector_scalar_to_grid(src_crs, target_proj, regrid_shape, x, y, u, v,
*scalars, **kwargs):

I think that would be nice because then you could just iterate over the *args and return a tuple without worrying about whether coord/rowcoord were present or not.

cyclic points: I agree that the usual cycle is in degrees, but you could also have radians, or I think generally we are concerned with it being any projection that wraps where we want another data point on the far side (Mercator for example). My preference would be to make this keyword controllable. wrap=360 or something similar?

Masked arrays: The problem with calling the masked implementation is that np.ma.concatenate([ndarray, ndarray]) will return a MaskedArray even if you didn't pass one in. I'd prefer it to be something like this so you get the same class back.

concat = np.ma.concatenate if np.ma.is_masked(arr) else np.concatenate
cyclic_arr = concat((arr, arr[..., :1] + wrap), axis=-1)

Another option for your use case would be to call add_cyclic_point point twice I think, once with (Z, coord=lons2d) and then just call it again with (lat2d) if we don't want to add more args/kwargs now and instead just add the 2d capability...

@mcuntz
Copy link
Contributor Author

mcuntz commented May 14, 2021

Dear Greg,

I tested different things in cartopy and the coordinates have to be in degree. Radians do not work. I was wondering before how cartopy would figure out that radians or a high-resolution local field in degrees were given. It does not.

The other suggestions all lead to a version that would give different results than the old version. For example, a user might rely on the fact that the results of add_cyclic_point is a masked array. (I find your way pretty elegant, though.)

It looks to me that we mix enhancements of the routine and compatibility-breaking changes.
So I could, for example, simply add a few lines that do exactly the same for coords in 1D and in 2D. This keeps all the restrictions such as that the coordinates must be equally-spaced and that the output data and coord are masked arrays.
And I could add a new function such as add_cyclic_point_2d that makes things a bit different. Given like that, we could even skip the enhancement to 2D of `add_cyclic_point'. This function would actually be pretty similar to the one I currently have except for the masking. I would stay with the restrictions of degrees so I can also check if a cyclic point already exists.

So would this suit you, i.e. adding a new function rather than changing the old one? Any idea of a better name than add_cyclic_point_2d?

@greglucas
Copy link
Contributor

The coordinates need to be in whatever units your Projection is defined in, which are often in lat/lon, but certainly not always. For example, see the Geostationary projection example, although that doesn't help with your cyclic units. IMO it'd still be nice to not be reliant on a specific number, or at least add a kwarg to control the wrap point.

I also think that it is worth returning the same class you passed in. If I didn't have a MaskedArray initially, I don't want it to be turned into one through this function.

Might even want to draw on Basemap's version for some inspiration as well?
https://github.com/matplotlib/basemap/blob/a516e5b1de2f26d9adfcced1215bee7285c61d22/lib/mpl_toolkits/basemap/__init__.py#L5105-L5119

estr = 'coord.shape[{}] does not match'
estr += ' the size of the corresponding dimension of'
estr += ' the data array: coord.shape[{}] ='
estr += ' {}, data.shape[{}] = {}.'.format(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F523 '...'.format(...) has unused arguments at position(s): 3, 4

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turn these into f-strings instead.

estr = 'rowcoord.shape[{}] does not match'
estr += ' the size of the corresponding dimension of'
estr += ' the data array: rowcoord.shape[{}] ='
estr += ' {}, data.shape[{}] = {}.'.format(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F523 '...'.format(...) has unused arguments at position(s): 3, 4

@mcuntz
Copy link
Contributor Author

mcuntz commented Jun 5, 2021

Dear Greg,

I removed the additions to add_cyclic_point and made a new function add_cyclic. This allows to include all your suggestions while keeping backward compatibility. I inspired myself by basemap (thanks for the hint).

The routine keeps the same class on output as on input.

There is a cyclic=360 keyword for longitude/latitude in other than degrees.

The routine checks if a cyclic point is already present. I followed basemap by checking the difference to be less than 1e-4. There is a prec=1e-4 keyword which allows to adapt this check for example with longitude/latitude other than degrees.

There are two notable differences to basemap.

  1. The added point is just lon[0] + cyclic. I do not do modulo cyclic as does basemap: np.mod(lon[0] + cyclic, cyclic). I thought it might not work well with longitudes calculated by proj4, which are often not exactly 0, for example. Using a cyclic longitude of 365.625, for example, worked for me. Tell me if you think that we should rather make modulo cyclic.
  2. basemap always uses the axis keyword on the longitude and latitudes as it does on the data. My use cases would have longitudes/latitudes with different number of dimensions than the data, for example simply 1d. So I use the right-most axis for longitudes/latitudes in 1d or 2d. I use the axis keyword as basemap for longitudes/latitudes in higher dimensions.

The routine is documented and extensive tests are added.

@mcuntz
Copy link
Contributor Author

mcuntz commented Jun 8, 2021

One doctest output int but I had given float. cyclic was float before but it is int on default now to preserve lon if given as int.

pytest --pyargs cartopy -k cyclic works (37 tests passed).

pytest --pyargs cartopy fails in 6 other modules, probably because of a new shapely version (it worked before). I do not think that this is related to my new function add_cyclic, which is not used anywhere yet.

@mcuntz
Copy link
Contributor Author

mcuntz commented Aug 29, 2021

Dear Greg,

just wanted to ask about the progress on this after my holidays ;-)

Kind regards,
Matthias

@greglucas
Copy link
Contributor

@mcuntz, can you "@greglucas" to make sure I get the direct ping, sometimes these fall through my e-mails. My apologies for letting this drop off my radar. I will try and look at this early next week.

@mcuntz
Copy link
Contributor Author

mcuntz commented Sep 3, 2021

Thanks @greglucas :-)

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will say you've definitely got the tests covered :) That is nice to add that many cases here! Thanks for looking into the basemap work as well, I think that is a nice addition to this.

Do we want to have both add_cyclic and add_cyclic_point? As a user I wouldn't know what the difference is without reading some long docstrings. I'm not sure I have a better suggestion, but something to think about how this function is distinguished from the other one. (I also get annoyed with Cartopy's transform_point and transform_points calls not having the same argument order and confusing them all the time)

Another suggestion is to go from coord, rowcoord to xcoord, ycoord instead. I think the x/y would be easier to understand at first glance.

Other stylistic/code suggestions inline.

import numpy as np
import numpy.ma as ma


__all__ = ['add_cyclic_point', 'add_cyclic']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to add this.


Returns
-------
The coordinate `lon` with a cyclic point added.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The coordinate `lon` with a cyclic point added.
The coordinate array `lon` with a cyclic point added.


Parameters
----------
lon: ndarray, optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lon: ndarray, optional
lon : ndarray

numpydoc should have a space on each side of the colon. This applies to all of the docstrings here.


Parameters
----------
lon: ndarray, optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lon: ndarray, optional
lon: ndarray

cyclic=360, prec=1e-4):
"""
Add a cyclic point to an array and optionally corresponding
column (`coord` ~ longitudes) and row coordinates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about changing all coord to xcoord, and say that the wrapping can only occur in x? Then instead of rowcoord you could have ycoord.

[ 18. 18. 18. 18. 18. 18. 18.]
[ 54. 54. 54. 54. 54. 54. 54.]]

Not adding a cyclic point if cyclic point detected in coord.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to warn here? I could see both sides, one where someone wants to call add_cyclic every time and they would be annoyed with a warning, but also someone else may want to call add_cyclic on their data and expect it to be expanded when it wasn't and now the returns aren't the shape they were expecting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not want to warn here. I wrote the code to use it in one of my projects, of course ncvue. And there I am your case 1, i.e. I call it every the time. I regard the second case as erroneous anyway.
I could imagine a compromise by making _has_cyclic a public function (calling it has_cyclic).

else:
caxis = -1
if coord.shape[caxis] != data.shape[axis]:
estr = 'coord.shape[{}] does not match'.format(caxis)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
estr = 'coord.shape[{}] does not match'.format(caxis)
estr = (f'coord.shape[{caxis}] does not match'

Use f-strings throughout the error messages and just continue the string onto the next line instead of += each time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I did not know f-strings yet. I changed the code and will try to switch over in future codes.

estr = 'coord.shape[{}] does not match'
estr += ' the size of the corresponding dimension of'
estr += ' the data array: coord.shape[{}] ='
estr += ' {}, data.shape[{}] = {}.'.format(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turn these into f-strings instead.

return data, coord
else:
return data, coord, rowcoord
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could remove this entire indented else block if you want since you've returned from within the if statement above.

ocoord = _add_cyclic_lon(coord, axis=caxis, cyclic=cyclic)
if rowcoord is None:
return odata, ocoord
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here for removing the else indent.

The coordinate with a cyclic point, only returned if the `coord`
keyword was supplied.
cyclic_rowcoord
The row coordinate with the last column of the cyclic axis duplicated, only returned

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E501 line too long (92 > 79 characters)

@mcuntz
Copy link
Contributor Author

mcuntz commented Sep 10, 2021

Dear @greglucas,

thanks for the thorough editing. I changed everything accordingly.

The only thing left now is the name of the function. Following your requests, the new functions does not give exactly the same outcome as the original add_cyclic_point for the same inputs. So we cannot remove add_cyclic_point.
We could call the new function add_cyclic_point_2d, which is pretty talkative and would probably be the least confusing to the users.
In other cases, one also often sees enhanced implementations called something like add_cyclic_point2, such as atan and atan2. Then the users normally know they should look at glance at the calling sequence or the docstring first.

I left the names coord and rowcoord even that I do not like them (o.k. rowcoord is from me trying to follow the logic of coord). But as you pointed out with transform_point and transform_points, it is annoying that they do not have the same arguments. So I sticked with the arguments of add_cyclic_point. You suggested xcoord and ycoord, which I also find much more self-explanatory than coord, for example. I personally would have simply taken x and y.

Almost there,
Matthias

@QuLogic
Copy link
Member

QuLogic commented Sep 10, 2021

This will need a rebase to fix conflicts, and that should get the tests running properly again.

@mcuntz
Copy link
Contributor Author

mcuntz commented Sep 12, 2021

@QuLogic
I did a rebase as requested following the explanations of gitwash as recommended in the Cartopy's Contribution Guidelines. Now there are 159 commits in 155 files with everything that happened on master since my initial commit in March.
I do not know if this is what was intended. But all tests work now ;-)

I could start over if you like, that means do a brand new PR from the current master with only the three files changed that I have touched. Then you would need to explain me exactly what to do next time to rebase correctly.

@greglucas
It looks like that we should finish soon. Otherwise even more commits will appear.

@greglucas
Copy link
Contributor

@mcuntz, I just rebased this onto master for you to remove the duplicate commits.

git fetch upstream
git rebase -i upstream/master

and then you can resolve the conflicts. I think I got all of these right, but you had a few commits that added something and then removed it again which were confusing. You will want to either rebase your local copy as I did above, or pull down this new version.

.. testsetup::
>>> from distutils.version import LooseVersion
>>> import numpy as np
>>> if LooseVersion(np.__version__) >= '1.14.0':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was removed.

[0. 1. 2. 3. 4. 5. 0.]
[0. 1. 2. 3. 4. 5. 0.]
[0. 1. 2. 3. 4. 5. 0.]]
[[ 0. 1. 2. 3. 4. 5. 0.]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes should be reverted.

Comment on lines 80 to 84
raise ValueError('The length of the coordinate does not match '
'the size of the corresponding dimension of '
'the data array: len(coord) = {}, '
'data.shape[{}] = {}.'.format(
len(coord), axis, data.shape[axis]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use f-strings.

Comment on lines 123 to 124
estr = 'The specified axis does not correspond to an array dimension.'
raise ValueError(estr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
estr = 'The specified axis does not correspond to an array dimension.'
raise ValueError(estr)
raise ValueError(
'The specified axis does not correspond to an array dimension.')

Adding a cyclic point to a data array, where the cyclic dimension is
the right-most dimension.
>>> import numpy as np
>>> np.set_printoptions(legacy='1.13')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this.

column (`coord` ~ longitudes) and row coordinates
(`rowcoord` ~ latitudes).

The call is `add_cyclic(data[, coord[, rowcoord]])`.
Copy link
Member

@QuLogic QuLogic Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what the signature already says, so not sure this line adds anything.

if coord is None:
return _add_cyclic_data(data, axis=axis)
# if coord
if (coord.ndim > 2):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra parentheses.

ocoord = _add_cyclic_lon(coord, axis=caxis, cyclic=cyclic)
if rowcoord is None:
return odata, ocoord
# if rowcoord
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete comment.

if rowcoord is None:
return odata, ocoord
# if rowcoord
if (rowcoord.ndim > 2):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra parentheses.

@@ -64,3 +65,322 @@ def test_invalid_coord_size(self):
def test_invalid_axis(self):
with pytest.raises(ValueError):
add_cyclic_point(self.data2d, axis=-3)


class Test_add_cyclic:
Copy link
Member

@QuLogic QuLogic Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring the existing test, no need to break pep8 naming here since you have the docstring.

Suggested change
class Test_add_cyclic:
class TestAddCyclic:

@mcuntz
Copy link
Contributor Author

mcuntz commented Sep 15, 2021

@greglucas I pulled and had to resolve one conflict in the docstring of add_cyclic_point, worked otherwise. I will do the rebase the next time.

I included all suggestions of @QuLogic .

I then also made the function has_cyclic public, i.e. without a leading underscore (as mentioned in a comment before). I added examples in the docstring and added a new test class.

So if you want to have the function name changed to add_cyclic_point_2d, add_cyclic_point_nd, or add_cyclic_point2, just tell me and I will do it quickly in the util.py and test_util.py. Or you can just do it yourself before merging. I am happy with all the names.
As said, I am also happy changing to xcoord and ycoord if you do not want to have the same keyword coord (as second parameter) as has add_cyclic_point (but cf. your example of transform_point and transform_points).

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the "coord/rowcoord", or my previous suggestion of "xcoord/ycoord", how about calling it "x/y" instead? That would perhaps be even more obvious and also move us farther away from the coord confusion. You could put in some words saying the cycling point can only be in the "x" coordinate.

One other thought I had is if you'd want to put in a gallery example showing why someone would want to use this and how to do it in one or two of the cases?

On the rebase it looks like you got the double commits again. If you do: git rebase -i upstream/master you should be able to squash or drop all of the duplicate commits so you can select only the commits you want to keep and clean up any duplication.

@@ -6,10 +6,12 @@

import numpy as np
import numpy.ma as ma
from numpy.testing import assert_array_equal
from numpy.testing import assert_array_equal, assert_equal
import pytest

from cartopy.util import add_cyclic_point
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put all of the util imports on one line?

setup.py Outdated
@@ -26,11 +26,11 @@

import fnmatch
import os
import shutil

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E402 module level import not at top of file

setup.py Outdated
import subprocess
import warnings
from collections import defaultdict
from distutils.spawn import find_executable
from distutils.sysconfig import get_config_var
from sysconfig import get_config_var

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E402 module level import not at top of file

@mcuntz
Copy link
Contributor Author

mcuntz commented Oct 3, 2021

Dear @greglucas

I give up with this rebasing. I had squashed all the last commits into one and rebased on master. Looked fine but then I wanted to push and I had to resolve again all conflicts. And even Stickler-CI found issues in setup.py, which I did not touch at all :-(

In any case, I do not think that anybody (including me) is interested on the intermediate steps. So I thought that, if you agree, I would close this PR and open a new one committing just the final changes (once we decide them to be final). This would be the cleanest.
But I do not know of the work involved on your side. So if it is easiest for you staying with this PR, I can also stay with it; as you like it. In the latter case, could you tell me what to do with the Stickler-CI issue?

Otherwise I call coord x and rowcoord y now.
I also put in a Gallery example. This was a pretty cool suggestion because it made me discover the Sphinx gallery extension and it also made me find an error in case of 1-dimensional latitudes. It would be nice if you could read the text of the gallery example examples/scalar_data/wrapping_global.py. Hard to judge if I use too many or too little words ;-)

Thanks in advance,
Matthias

@greglucas
Copy link
Contributor

It looks like you are not doing an interactive rebase, and potentially going the wrong direction by bringing master into your new branch, rather than putting your commits onto master.

Rebasing is really powerful and helpful in lots of other ways, so I would suggest trying it if you have some patience left :) Atlassian has some good descriptions of why/how to do it: https://www.atlassian.com/git/tutorials/rewriting-history/git-rebase
Otherwise, yes you can certainly open up a new PR to start fresh on a new branch. You can always force push that single commit you have to this branch as well if you want.

add_cyclic_point(self.data2d, axis=-3)


class TestAddCyclic:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F811 redefinition of unused 'TestAddCyclic' from line 69

return npc.concatenate((data, data[tuple(slicer)]), axis=axis)


def _add_cyclic_lon(lon, axis=-1, cyclic=360):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F811 redefinition of unused '_add_cyclic_lon' from line 46

return _add_cyclic_data(data, axis=axis)
# if coord was given
if coord.ndim > 2:
caxis = axis

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F841 local variable 'caxis' is assigned to but never used

if coord.ndim > 2:
caxis = axis
else:
return_value = new_data, new_coord

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'new_data'
F821 undefined name 'new_coord'

return return_value


def _add_cyclic_data(data, axis=-1):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F811 redefinition of unused '_add_cyclic_data' from line 124

return npc.concatenate((x, cx), axis=axis)


def has_cyclic(x, axis=-1, cyclic=360, precision=1e-4):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F811 redefinition of unused 'has_cyclic' from line 183

return False


def add_cyclic(data, x=None, y=None, axis=-1,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F811 redefinition of unused 'add_cyclic' from line 238

@mcuntz
Copy link
Contributor Author

mcuntz commented Oct 3, 2021

Hey @greglucas

o.k. I went through the rebase and basically squashed all commits into one. I read all your recommendations, did git fetch upstream, and git rebase -i upstream/master. I had to resolve conflicts on basically each single commit (this is what I have not understood). I was ending up with an erroneous final version :-( And I had to add --force to the git push afterwards. I hesitated quite a while to add --force.

Anyway, at least on github it looks that I have one single commit now that is rebased on the current master. All test passed. If you are fine with the text in examples/scalar_data/wrapping_global.py then I call it a day and decide that I have finished and think the feature is ready to be added :-)

Kind regards,
Matthias

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Well done. This is much easier to look at and review :) thanks.

Just some minor comments here.

I like the example. I changed up some of the wording just to give a suggestion, feel free to play with that some more. I think the general idea is you want to get across what "cyclic" means in this sense. I'm still not sure I understand it from the text alone.

In my mind, these are the things to highlight in some form:

  • data has the first column repeated and appended to the end of the data array
  • x repeats the first column and adds the wrap amount (i.e. more than just a repeat)
  • y gets expanded to the proper shape

(column may be the wrong term above, but I find it easier in a simplified example to just focus on the most common use-case and not to dwell on corner cases like n-dimensional arrays, that could be another example if needed)

@@ -0,0 +1,91 @@
"""
Wrap around global data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Wrap around global data
Adding a cyclic point to help with global wraps

I think it would help to have "cyclic" in the title since that is what you're demonstrating.

Comment on lines 5 to 10
Matplotlib/Cartopy use Cartesian coordinates if you tell contourf or
pcolor(mesh) that data is in the PlateCarree projected coordinate system. 350
degrees longitude, for example, is hence not just 10 degrees away from 0
degrees as in spherical coordinates. contourf will not plot data between the
last and the first longitude (pcolor and pcolormesh depend on the shading
keyword).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Matplotlib/Cartopy use Cartesian coordinates if you tell contourf or
pcolor(mesh) that data is in the PlateCarree projected coordinate system. 350
degrees longitude, for example, is hence not just 10 degrees away from 0
degrees as in spherical coordinates. contourf will not plot data between the
last and the first longitude (pcolor and pcolormesh depend on the shading
keyword).
Cartopy represents data in Cartesian projected coordinates, meaning
that 350 degrees longitude, is not just 10 degrees away from 0
degrees as it is when represented in spherical coordinates. This means that
the plotting methods will not plot data between the last and the first longitude.

Comment on lines 12 to 15
One can add an extra point to the data and the longitude/latitudes so that this
gap gets closed. The routine `add_cyclic` repeats the last data column and adds
the first longitude to the end of the longitude array so that the data values
at the last longitudes will be plotted up to the first longitude.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
One can add an extra point to the data and the longitude/latitudes so that this
gap gets closed. The routine `add_cyclic` repeats the last data column and adds
the first longitude to the end of the longitude array so that the data values
at the last longitudes will be plotted up to the first longitude.
To help with this, the data and longitude/latitude coordinate arrays can be expanded with
a cyclic point to close this gap. The routine `add_cyclic` repeats the last data column and adds
the first longitude to the end of the longitude array so that the data values
at the ending longitudes will be closed to the wrap point.

nlat = 12
# 7.5, 22.5, ..., 337.5, 352.5
dlon = 360//nlon
lon = np.arange(0+dlon/2., 360, dlon)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lon = np.arange(0+dlon/2., 360, dlon)
lon = np.linspace(7.5, 352.5, 24)

I find linspace easier to understand if you have a specific output in mind. Ditto down below.

Comment on lines 74 to 75
# add_cyclic works also in n-dimensions.
# Cyclic points are then added to data, longitudes, and latitudes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# add_cyclic works also in n-dimensions.
# Cyclic points are then added to data, longitudes, and latitudes
# add_cyclic also works with 2-dimensional data
# Cyclic points are added to data, longitudes, and latitudes to
# ensure the dimensions of the returned arrays are all the same shape.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You aren't demonstrating the n-dimensional case, so I'd just emphasize the 2d here for example purposes.

assert_array_equal(c_data, r_data)
assert_array_equal(c_lons, r_lons)

def test_data_and_x_2d(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general comment, there is a lot of repeated code in the tests here. I would find it a little more clear to parameterize these and make explicit calls with (x, y, lat, expected_shape, expected_data, ...) and having a single test function that gets repeatedly called with different inputs.
https://docs.pytest.org/en/6.2.x/parametrize.html#pytest-mark-parametrize-parametrizing-test-functions

But, I won't hold it up for this either.

c_data = add_cyclic(new_data)
r_data = ma.concatenate((self.data2d, self.data2d[:, :1]), axis=1)
assert_array_equal(c_data, r_data)
assert_equal(ma.is_masked(c_data), True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_equal(ma.is_masked(c_data), True)
assert ma.is_masked(c_data)

If you are doing comparisons to booleans, just assert them, or assert not ...

Comment on lines 445 to 448
ido = has_cyclic(self.clons, axis=0)
idont = has_cyclic(self.lons, axis=0)
assert_equal(ido, True)
assert_equal(idont, False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ido = has_cyclic(self.clons, axis=0)
idont = has_cyclic(self.lons, axis=0)
assert_equal(ido, True)
assert_equal(idont, False)
assert has_cyclic(self.clons, axis=0)
assert not has_cyclic(self.lons, axis=0)

# if y was given
return data, x, y
# if not has_cyclic, add cyclic points to data and x
odata = _add_cyclic_data(data, axis=axis)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
odata = _add_cyclic_data(data, axis=axis)
out_data = _add_cyclic_data(data, axis=axis)

I think you meant out here? I'd be explicit in the naming because it isn't very long. Same for the other "o"s

@mcuntz
Copy link
Contributor Author

mcuntz commented Oct 17, 2021

Dear @greglucas ,

I changed everything as suggested, example (title and explanation, only very slightly changing your suggestions), code (odata -> out_data, etc.), and tests. I cleaned the tests and even use parameterised tests with has_cyclic :-)

I use class variables for the expected outputs for the tests of add_cyclic. This is much more concise. Good suggestion. If I would parameterise the tests, then I would not really test the default keywords. So I decided against it for that routine.

I squashed to have a single commit again. And I rebased against the master as well.

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice updates! I think this is good to go. I have one more minor comment which is to avoid calling eval() as that has some security implications.


@pytest.mark.parametrize(
"lon, clon",
[("self.lons", "self.c_lons"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[("self.lons", "self.c_lons"),
[(self.lons, self.c_lons),

Can you avoid the eval(string) by passing in the objects directly?

…nsional longitudes and latitudes. It checks if a cyclic point is already present using the function has_cyclic.
@mcuntz
Copy link
Contributor Author

mcuntz commented Oct 20, 2021

Dear @greglucas ,

your suggestion using the object self.lons, for example, does not work because, as far as I have understood, the decorator is read at import time and not at runtime so Python raises a NameError that self is not defined.

The easiest way to remove eval() was to remove the setup_class method and just "hardcode" the longitude and latitudes into the class. I do not know if this is elegant or not but I have not understood the pytest.fixtures, which seems to be another way to achieve it.

@mcuntz
Copy link
Contributor Author

mcuntz commented Nov 4, 2021

Dear @greglucas

just to ping you again so that the master does not move away too much and I have to rebase again ;-)

Cheers,
Matthias

@greglucas
Copy link
Contributor

Thanks for the reminder!

I think this is good to go then, we can update some of the tests and things in follow ups if we want to clean any of the evals up.

Thanks for your patience with the reviews, @mcuntz! A very nice addition to add a multi-dimensional wrapping coordinate helper and examples of how to use it.

@greglucas greglucas merged commit e88eebc into SciTools:master Nov 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants