Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Geometries Contribution (and github experimentation) #115

Merged
merged 43 commits into from
Nov 14, 2018
Merged

Geometries Contribution (and github experimentation) #115

merged 43 commits into from
Nov 14, 2018

Conversation

dblodgett-usgs
Copy link
Contributor

I've reconciled review from #109. I also formatted my contribution so each sentence is on it's own line. This will allow us to comment on each line individually.

Notes regarding reconciliation of @JonathanGregory's review.

Accepted suggestion to dropping "spatial" on geometry throughout.

Accepted most changes moving description of a geometry into other parts of the document. Reference to "other geometry types" seemed out of place and could just be dropped.

Accepted removing last node equals first requirement. I want to point out that nothing stops an implementation from using last node = first data, so removing nodes from typical simple features polygons is not necessary. I've added a sentence to that affect so as not to confuse people.

Hesitantly accept removing the first two sentences introducing how geometry relates to the instance dimension in DSG. I think the lack of context will confuse some, but don't disagree that it is not in line with the ammount of context given in the rest of the spec. This conciseness is a quality of CF, but also a detractor... so accept.

Largely accepted changes to paragraphs about the geometry container variable, geometry_type, node_coordinates, and cf_role.

Accepting suggestion to allow the node_count to be dropped if only one geometry. Added a clause to that affect.

Removed explanation of why polygon order is different. There are a few other reasons and it's not really relavent in the context of the spec.

Rejecting change to have polygons with holes be considered multipolygons. A multipolygon represents multiple features, a polygon a single feature. Polyginal features can have holes. So we need a regular polygon to be allowed to have holes.

Accept moving / removing the encoding section. In retrospect this section is not in the style of the rest of the CF spec. As much as I wish CF was a little easier to parse, through repitition and some explanation, that's not the style of the rest of the document.

I'll accept removal of the crs and grid_mapping details. This is not the point of the geometry contribution and not worth holding things up for. However, we need to allow encoding only geometry--without associated variables that would otherwise contain a grid_mapping. I've added that the grid_mapping can be carried by the geometry_container.

Accept suggestion to move the appendix E example into chapter 7.

There was a suggestion to add an additional geometry to the multipolygon example. There will be other examples available outside of the spec. I'd prefer to keep the multipolygon as a single multipolygon as it is already quite large.

ch07.adoc Outdated
[[spatial-geometries, Section 7.5, "Spatial Geometries"]]
=== Geometries

For many geospatial applications, data values are associated with a geometry, which is a spatial representation of a real-world feature, for instance a time-series of areal average precipitation over a watershed.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonathanGregory and @davidhassell - we can now comment on each line individually as needed.

@JonathanGregory
Copy link
Contributor

Dear Dave

Many thanks for being so accommodating of my comments! I think David and I should discuss the new version, since we're in the same place, so we can produce a single reply to you.

I appreciate that you aren't entirely comfortable with the terseness of the typical CF style and I certainly agree that we shouldn't be so sparing of words that we're not clear what we mean. We should spell them out if necessary.

As I said before, I accept the distinction in principle between a multipolygon and a polygon with holes. In practice, how could the CF checker be sure a geometry was correctly classified? Would it correct to say that if there is more than one exterior polygon in a geometry, it must be a multipolygon? Presumably you can also have multipolygons with holes - yes?

Regarding the process, I agree it's easier to comment if you break it up into lines. That was a good idea.

I continue to think I don't think we need to start a new revision to the document itself, leading to a pull request, until it's fairly settled. For many changes, which aren't nearly as large as this one, it'd be fine to discuss text in postings to the issue (in MarkDown), like we have done up to now with trac. For wordsmithing of large changes, I agree that Google docs is a possibility, if we link the doc in postings on the issue. I would argue that a new doc should be started each time there is a "revision" (not closely defined) of the draft, so that the whole history can be seen - or can Google docs have version control, perhaps? For the same reason of traceability, I feel that the same issue should be used from beginning to end, again as we have done with trac tickets.

Best wishes

Jonathan

@dblodgett-usgs
Copy link
Contributor Author

Dear Jonathan,

Thanks for your thorough consideration through this process. I really appreciate the collegiality and all the time you've spent.

Would it correct to say that if there is more than one exterior polygon in a geometry, it must be a multipolygon?

Yep. That is the distinction.

Google docs does have a "history" that's quite good. You can view who changed what with a nice diff back through time.

Agreed that an issue should be maintained.
This (pull request) is a bit of a special case since we are using it as a vehicle for figuring out github process. If you look at the end of #109 you'll see my references to it from here. -- linking between things on github allows for a bit more atomic focus on issues, which can be a good thing.

ch07.adoc Outdated

[[complete-multipolygon-example]]
[caption="Example 7.15. "]
.A multipolygon with holes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this for a smaller CDL multipolygon example that includes two features and a hole? First feature has two parts with first part including a hole. Second feature has one part.

dimensions:
  node = 12 ;
  feature = 2 ;
  part = 4 ;
variables:
  double x(node) ;
    x:units = "degrees_east" ;
    x:standard_name = "longitude" ;
    x:cf_role = "geometry_x_node" ;
  double y(node) ;
    y:units = "degrees_north" ;
    y:standard_name = "latitude" ;
    y:cf_role = "geometry_y_node" ;
  float geometry_container ;
    geometry_container:geometry_type = "multipolygon" ;
    geometry_container:node_count = "node_count" ;
    geometry_container:node_coordinates = "x y" ;
    geometry_container:grid_mapping = "crs" ;
    geometry_container:part_node_count = "part_node_count" ;
    geometry_container:interior_ring = "interior_ring" ;
  int node_count(feature) ;
  int part_node_count(part) ;
  int interior_ring(part) ;
  float crs ;
    crs:grid_mapping_name = "latitude_longitude" ;
    crs:semi_major_axis = 6378137. ;
    crs:inverse_flattening = 298.257223563 ;
    crs:longitude_of_prime_meridian = 0. ;
// global attributes:
  :Conventions = "CF-1.8" ;
data:
 x = 0, 10, 20, 5, 10, 15, 0, 10, 20, 30, 40, 50 ;
 y = 0, 15, 0, 5, 10, 5, 20, 35, 20, 0, 15, 0 ;
 geometry_container = 0. ;
 node_count = 9, 3 ;
 part_node_count = 3, 3, 3, 3 ;
 interior_ring = 0, 1, 0, 0 ;
 crs = 0. ;

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example with correct node order for exterior vs interior.

dimensions:
  node = 12 ;
  feature = 2 ;
  part = 4 ;
variables:
  double x(node) ;
    x:units = "degrees_east" ;
    x:standard_name = "longitude" ;
    x:cf_role = "geometry_x_node" ;
  double y(node) ;
    y:units = "degrees_north" ;
    y:standard_name = "latitude" ;
    y:cf_role = "geometry_y_node" ;
  float geometry_container ;
    geometry_container:geometry_type = "multipolygon" ;
    geometry_container:node_count = "node_count" ;
    geometry_container:node_coordinates = "x y" ;
    geometry_container:grid_mapping = "crs" ;
    geometry_container:part_node_count = "part_node_count" ;
    geometry_container:interior_ring = "interior_ring" ;
  int node_count(feature) ;
  int part_node_count(part) ;
  int interior_ring(part) ;
  float crs ;
    crs:grid_mapping_name = "latitude_longitude" ;
    crs:semi_major_axis = 6378137. ;
    crs:inverse_flattening = 298.257223563 ;
    crs:longitude_of_prime_meridian = 0. ;
// global attributes:
  :Conventions = "CF-1.8" ;
data:
 x = 20, 10, 0, 5, 10, 15, 20, 10, 0, 50, 40, 30 ;
 y = 0, 15, 0, 5, 10, 5, 20, 35, 20, 0, 15, 0 ;
 geometry_container = 0. ;
 node_count = 9, 3 ;
 part_node_count = 3, 3, 3, 3 ;
 interior_ring = 0, 1, 0, 0 ;
 crs = 0. ;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll get it in here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... This example needs a data variable ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dblodgett-usgs incorporated a data variable into the example. Scroll to the end of the latest version (as I am typing this) here: https://github.com/dblodgett-usgs/cf-conventions/blob/7768e33e7edff459482e8ef8057ea6b8e015c9eb/ch07.adoc

ch07.adoc Outdated
@@ -544,3 +544,164 @@ data: // time coordinates translated to date/time format
"2000-8-1 6:00:00", "2000-9-1 6:00:00" ;
----
====

[[spatial-geometries, Section 7.5, "Spatial Geometries"]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we call it just "Geometries" (without "Spatial"), as you have done elsewhere? Or "Simple geometries", like you originally called it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I missed this. Will call it geometries.

@davidhassell
Copy link
Contributor

@JonathanGregory and I had a good discussion about simple geometries, the results of which I will report as in-line comments in the new chapter 7 text ...

ch07.adoc Outdated

For many geospatial applications, data values are associated with a geometry, which is a spatial representation of a real-world feature, for instance a time-series of areal average precipitation over a watershed.
Polygonal cells with an arbitrary number of vertices can be described using <<cell-boundaries>>, but in that case every cell must have the same number of vertices.
In contrast, each geometry associated with a given data variable may have a different number of nodes, the geometries may be lines (as alternatives to points and polygons), and they may be __multipart__ i.e include several disjoint parts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to point out that points and lines and are not really "cell bounds" in the chapter 7.1 sense, but they describe "cell" extent in a different way. Perhaps adding something about "line and point geometries describe cells that don't have an area in the traditional, polygonal sense" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Will work something in.

ch07.adoc Outdated
Note that because multipoint geometries always have a single node per part, the **`part_node_count`** is not required.
The single dimension of the part node count variable should equal the total number of parts in all the geometries.

For __polygon__ and __multipolygon__ geometries with holes, the geometry container variable must have an **`interior_ring`** attribute that contains the name of a variable that indicates if the polygon parts are interior rings (i.e., holes) or not.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With regards the difference between polygon and multipolygon, a use-case to consider is if the instance dimension (in the example given) is an unlimited dimension to allow data to come in gradually. In this case, you might not know in advance if each incoming-instance is a polygon or a multipolygon, so you would be forced to use multipolygon, even if some of the geometries consisted of just one polygon (!). So on those grounds, I don't think that the distinction between polygon and multipolygon is useful, and just polygon should suffice, I think. The same argument applies on multipoint and multiline.

Copy link
Contributor Author

@dblodgett-usgs dblodgett-usgs Jun 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree here. Saying that a geometry is a polygon (or multi) type when you declare it allows an implementation to make some pretty important assumptions about the data it is going to be working with, making the distinction very useful.

The potential of an unlimited instance dimension, while possible, is an edge case as far as I'm concerned. Even if it were to be required, within a dataset, there would be a policy regarding support for multipolygons.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that I don't see how specifying multipolygon helps. It is possible that at least one cell of a multipolygon container contains a "non-multi" polygon, so the geometry for each cell needs to be ascertained by inspecting the part_node_count and interior_ring attributes, regardless.

Is it right that a geometry is a multipolygon if it has two or more parts that are exterior rings, as determined by the interior_ring attribute or assumed to be exterior in this attribute's absence?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm not being very eloquent... There are a few things at play here.

The interior_ring attribute indicates whether a given ring is interior or exterior. While this could be inferred through inspection of the node winding direction, in most cases it is much more convenient to have this in an attribute.

So yes, it is right that:

a geometry is a multipolygon if it has two or more parts that are exterior rings

but this is not quite the right interpretation.

as determined by the interior_ring attribute.

interior_ring is an indication of whether a ring is a hole or not. It is very convenient to have this flag when reading data into memory where you don't want to test the node winding direction to tell if a ring is a hole or not, this is true for both single and multi polygons. interior_ring is not meant to help tell between single and multi geometry types but it could be used to infer something about the distinction on a geometry by geometry basis.

That said, having the distinction between single and multi types is useful for two reasons (and probably more that I'm not thinking of).

  1. Some software only needs to support single polygon types. They need to be able to tell that the data they are about to read is unambiguously a type they are designed to handle.
  2. Typical geometry objects (in code / in memory) need to be instantiated as single or multi types. It would be very problematic to have to scan a file to see if it's truly a multi type or not.

Because of this, if we were to say, "just assume everything in CF is a multi geometry because the multi type is a super class of the single type", everyone would have to support the complexity of multi types. We would also run into situations where software that only works with single geometry types would be in a bind when opening and attempting to read or write data in CF. -- So at the end of the day, this is about being consistent and interoperable with practices in other similar encodings and specifications and playing nice with people who need to implement support for a NetCDF-CF encoding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we combine polygon+multipolygon:

  1. Do we need a variable on the instance dimension indicating whether the geometry is multipart? Or shall software figure it out by looking at, e.g., node count and part node count?
  2. Shall we combine line and multiline into "line"? I think so.
  3. Shall we combine point and multipoint into "point"? I don't think so. To me, points and multipoints are more distinct than line/multiline and polygon/multipolygon, as in I have a hard time coming up with use cases for mixing them. I note that Esri offers point, multipoint, line, and polygon classes, for what I assume are good reasons (I've asked and will report back if I get a response).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: 1. -- I think it's easy enough to assume multipart and if you don't get multipart, cool! So software can figure it out.
Re: 2. -- yes. At the end of the day this distinction shouldn't matter. Just treat everything as if it's multi.
Re: 3. -- I might lean toward everything being multipoint (which can have a single point per geometry) in this part of the spec and if you want to do single point, it is basically the DSG "point" type. Make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

ch07.adoc Outdated
The **`node_coordinates`** attribute contains the blank delimited names of the variables that contain geometry node coordinates (one variable for each spatial dimension).


The geometry node coordinate variables must each have a **`cf_role`** attribute whose allowable values are __geometry_x_node__, __geometry_y_node__, and __geometry_z_node__.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re. cf_role : I originally didn't like this because a variable was already identified as being a node coordinate variable by virtue of it being named by a node_coordinates attribute of a geometry container variable. However, I subsequently realised that we need to know which of the named variables contains the X (or Y or Z) node coordinates.

How about a metadata approach which says that the node coordinate variables must be described by axis attributes, and standard names are recommended:

int geometry_container ;
    geometry_container:geometry_type = "line" ;
    geometry_container:node_count = "node_count" ;
    geometry_container:node_coordinates = "x y" ;
double x(node) ;
    x:units = "degrees_east" ;
    x:standard_name = "longitude" ;
    x:axis = "X" ; 
double y(node) ;
    y:units = "degrees_north" ;
    y:standard_name = "latitude" ;
    y:axis = "Y" ;

As node coordinate variables are not [auxiliary] coordinate variables, there would be no confusion with the existing standarised use of the axis attribute.

Coordinate variables which correspond to the node coordinates must be present should also be forced to have axis attributes with the usual meaning. There are two reasons for this - 1) for software that can not interpret the geometries, a representative coordinate location is still useful for analysis/plotting and 2) Allowing cells to be defined by bounds alone is new to CF and so, whilst that is something that we may want to explore in the future, I think that going down this route would only delay this ticket unnecessarily.

What value the representative coordinates should have is entirely up to the creator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are not "axes", but rather just a collection of nodes that form polygons, I don't think the axis pattern applies. We did discuss it a while back and decided use of cf_role is more appropriate. This is in line with the use of cf_role: projection_x_coordinate etc. when using projected coordinate variables.

I did mean to have a representative lat and lon point for each geometry which I think is what you are getting at in your comments about coordinate variables?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that cf_role has only been used by DSGs for timeseries_id, profile_id, and trajectory_id variables which are not otherwise linked to their data variable. A projected coordinate variable has a standard_name of projected_[xy]_coordinate. My main point is that the node coordinate variable are linked to their data variable so their role is not in question, but they still require particular metadata to understand them (just like a latitude coordinate variable must have certain metadata).

Re. representative lat and lon variables - that's what I had in mind, thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I see now. Sorry for the confusion. A while back, I think we had used standard_name for this and were told that cf_role was more appropriate. I see what you mean about needing particular metadata to clarify and how axis makes sense. I guess I can accept the "axis" is in reference to the coordinate-frame axis, which makes a bit more sense. Will update the text. See what you think after it's checked in.

ch07.adoc Outdated
For many geospatial applications, data values are associated with a geometry, which is a spatial representation of a real-world feature, for instance a time-series of areal average precipitation over a watershed.
Polygonal cells with an arbitrary number of vertices can be described using <<cell-boundaries>>, but in that case every cell must have the same number of vertices and must be single polygon rings.
In contrast, each geometry may have a different number of nodes, the geometries may be lines (as alternatives to points and polygons), and they may be __multipart__ i.e include several disjoint parts.
While line and point geometries don't describe an interval along a dimension as the traditional cell bounds described above do, they do describe the extent of a geometry or real-world feature so are included in this section.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonathanGregory and @davidhassell - How's this? Not sure what words to use to be consistent with the rest of CF, but I think this is what we were getting at?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dblodgett-usgs I think this is looking good! Thanks. Some minor comments from my latest read

  • Example 7.15: Needs a data variable to complete it
  • It would be handy to have a table defining the different types: line, multiline, point, multipolygon, etc. (Should this be in-line or in an appendix?)
  • Appendix A needs updating for the new attributes (or has this been done elsewhere, like the the grid mapping variables?)

All the best, David

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidhassell Example 7.15 was meant in part to demonstrate that geometry could be encoded without associated data variables. For example, it shows that you would need a grid_mapping attribute on the geometry container variable if there isn't a data variable providing that attribute. Do you think there's merit in showing an example of geometry without a data variable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twhiteaker I see, thanks. However, storing coordinates (a "domain") without a data variable is not currently allowed by CF, and I think that to introduce it would need a ticket and discussion in its own right - for example, how would you encode the "usual" regular lat-lon grid without a data variable to bind it all together?

This is similar in principle to insisting that there are also representative coordinates for each cell --- bounds without coordinates are also not currently possible.

All the best,

David

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidhassell Seems like it's better to get this proposal accepted and take broader steps like standalone geometries later then. I'll get a revised example 7.15 into the pull request ASAP.

ch07.adoc Outdated
geometry_container:node_count = "node_count" ;
geometry_container:node_coordinates = "x y" ;
geometry_container:grid_mapping = "crs" ;
geometry_container:coordinates = "lat lon"
Copy link
Contributor Author

@dblodgett-usgs dblodgett-usgs Jun 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since no data variable, I've added coordinates to the geometry_container. Will add a little note about this where we say the grid_mapping can be put on the geometry container.

@dblodgett-usgs
Copy link
Contributor Author

Think I got all the updates needed for the use of axis vs cf_role. See here for the diff: e4ef4b2

@graybeal
Copy link

A few comments on the process and (a bit of) content, as an 'outsider' who just read the 90 or so rather complicated messages in an hour, in my email:

  • Brilliant discussion of the technical details, and experimentation with the process. I could not possibly have read that amount of content, and arbitrarily checked diffs of interest, in any other form or forum. For this reason alone, I think the switch to github review/comment/update is a tremendous improvement.
  • There is great complexity to what github supports, which means a learning curve for newbies, for sure. But! The learning curve will teach them about best practices that can be used their entire career -- this is the best standard for collaborative threaded development of changes to content. Therefore...
  • The mixing of pull requests and line-by-line comments does not bother me. Each can be appropriate in its place, and learning both styles will not be impossible.
  • The adoption of Google doc pre-editing of changes, while leveraging another great tool and therefore appropriate as truly needed, is not my favorite thing to do, because it is so external, less under our control, and actually pretty difficult to easily follow changes over time (e.g., from time X to Y). If I want to see exactly where a particular concept got introduced in this thread, by whom, and how it involved, it would be pretty trivial (and pointers are easily shared); but very time-consuming to do the same thing in Google docs. Much as I love them.
  • Technically, nice to see such sophisticated consideration of this rather advanced feature (these rather advanced features). Not enough time to spend 3 hours validating everything, but the progression and resolutions seemed well justified.

Hope that helps, and that I haven't just accidentally created and merged a pull request that will merge all my comments directly into the standard. :-)

@martinjuckes
Copy link
Contributor

Tim has redirected me here from an earlier thread ( twhiteaker/CFGeom#75 (comment) ) .. one point I raised has already been resolved by Jonathan's review .. but I'd like to suggest adding a reference to the OGC/ISO Simple Features model near the start of the section on geometries.

I also have a question about coordinate transformations. I made a suggestion here (twhiteaker/CFGeom#74 (comment) ) that the CRS should be placed differently (placing grid_mapping in the geometry container rather than someData ) so that the information that a line between two points is straight in a specific CRS could be preserved when the coordinates used to represent the end points are transformed. David has responded that this is not the approach used in GIS .. instead lines need to be in-filled to preserve an appropriate degree of straightness in the representation of the line. I don't feel strongly about this, but if the approach is to follow GIS, I would like to see some explanation in the text to at least warn people of the complication that arises when coordinates are transformed.

@dblodgett-usgs
Copy link
Contributor Author

I've got no problem adding a reference to the OGC/ISO Simple Features, but am curious how @JonathanGregory feels about it -- it's not strictly necessary.

Are you suggesting that we not allow grid_mapping to be on a variable? That would not be in line with previous practice of CF. We DO allow, "A grid_mapping and coordinates attribute, which would usually by carried by a data variable, can be carried by the geometry container variable." So are already allowing what you suggest to be done.

We could add the sentence, "The connections between nodes should be assumed straight in the coordinate system the nodes are defined in." somewhere in the second paragraph of the new geometry section?

@martinjuckes
Copy link
Contributor

Sorry, I didn't spot that the change I was asking for with regard to CRS has already been made in https://github.com/dblodgett-usgs/cf-conventions/blob/simpleGeo/ch07.adoc It might be worth adding the sentence you suggest in your last paragraph, and perhaps adding ", which may differ from the coordinate system used to express the locations of the points."

@JonathanGregory
Copy link
Contributor

JonathanGregory commented Jul 11, 2017 via email

@martinjuckes
Copy link
Contributor

Sorry to be adding comments at this stage, but I've tried using the proposed standards for some regions used in IPCC assessment reports, and it works well .. all clear and well structured .. up to the point when I want to deal with the labels and descriptions of the regions (e.g. label="EAS", description="East Asia"). I could create string arrays for each attribute, but attaching attributes to polygons is a reasonably common requirement, so perhaps there should be something in the standard. This could be of the form of specific attributes attached to geometry_container, such as instance_label, instance_description, which would contain the name of appropriate variables, or it could be more generic with an attribute instance_attributes containing a list of variables, each of which contains values for an instance attribute, and the attribute name:

geometry_container:instance_attributes = "fi fo fum";

char fi(instance,len_lab):
  fi:name= "label";
char fo(instance,len_desc):
  fo:name="description";
char fum(instance,2,len_mask):
  fum:name="mask";
  fum:comment="IPCC Regional diagnostics should be applied to a masked portion or, if there is more that one non-blank mask specified, portions of the region:  "

data:
   fi = "EAS", "ENA", "MED";
   fo = "East Asia", "East North America", "Mediterranean";
   fum = "land", "", "land", "", "land", "";

Personally, I would prefer the 2nd, more generic option. The aim is to make the link between fi etc and the geometry explicit, rather than relying on inference from the use of the instance dimension. There is an additional complication in that the IPCC definition of the "MED" region is in terms of the land area within a specified geometry, and the precise location of the land differs between models -- but I don't think this last point needs to be addressed in the standard anywhere.

@dblodgett-usgs
Copy link
Contributor Author

I see what you are getting at, and I think most of what you need is in the discrete sampling geometries portion of the spec? I would like to consider this separate from the geometries topic if that's ok.

Assuming you are creating a DSG timeSeries featuretype and attaching geometries along the instance dimension, you should have the timeseries_id. I'm pretty sure that the DSG spec already says you should infer that things defined on the instance dimension are essentially attributes of the instances. So, this is already taken care of?

@dblodgett-usgs
Copy link
Contributor Author

From my perspective, we're good. The only thing left is to add Tim and my names to the spec which, I don't feel is really needed, but if the community wants to I think it would be ok.

@JonathanGregory
Copy link
Contributor

Yes, Dave and Tim should added as additional authors of the CF convention, at the start of the document. That's what we always do when substantial enhancements are made. Thanks.

twhiteaker and others added 2 commits May 9, 2018 16:25
Replace axis with bounds for coordinate variables related to geometry…
@dblodgett-usgs
Copy link
Contributor Author

Can we go ahead and merge this? I think we've checked all the boxes.

@marqh marqh self-assigned this Jun 20, 2018
@marqh
Copy link
Member

marqh commented Jun 20, 2018

@twhiteaker @JonathanGregory @davidhassell

please may you add to your 'review' and explicitly indicate whether you are in support of the proposal at commit
033cc25

thank you
mark

Copy link
Member

@marqh marqh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support this pull request at commit

033cc25

I confirm that the commits up to
f9cef4a

are suitable for adoption, in my view

Copy link
Contributor

@twhiteaker twhiteaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve these changes.

@twhiteaker
Copy link
Contributor

FYI, how to review.

@dblodgett-usgs
Copy link
Contributor Author

@davidhassell and @JonathanGregory -- can you register your approval for merging this change or let us know what else we should discuss?

@JonathanGregory
Copy link
Contributor

I have tried to add a review (formally) but I couldn't see how to do it. Following the instructions which Tim pointed to above, I added a comment, but I didn't see anything which said "Start a review" as it mentions. I have two comments:

  • The new statement about the bounds of coordinates naming the nodes should have corresponding requirements in the conformance document. Maybe you've done that?
  • It should be "as" rather than "at" in "Add Tim Whiteaker and Dave Blodgett at authors".

Thanks for doing this
Jonathan

@davidhassell
Copy link
Contributor

I worked out how to submit a review (there's a section "Submitting your review" towards the bottom of https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/) and have approved the changes.

I was happy to do this, as it seems like convenient way for those involved in the discussion to sign it off, but I would also like to see a comment in the issue (Trac 164 in this case) saying that consensus has been reached so we can start the 3 week countdown.

I'm not sure about merging this pull request into master at this time. I don't think we agreed on this procedure last week, and how best to go about merging changes is currently being discussed on github #130

@JonathanGregory The conformance work has been done, in a pull request to the Conformance repository: cf-convention/Conformance@f687062?short_path=a2980ec#diff-a2980ec1fbd787c60bcc53548dd58680 (which, I realise, is interesting, as we now have two seperate pull requests for the same proposal, which do not link to each other, as far as I can see)

twhiteaker and others added 5 commits June 25, 2018 09:07
…n right-hand rule for polygon rings, and that interior rings must occur after the exterior ring that contains them, as per discussion at June netCDF workshop.
Add bounds to first geometry example. Correct typo in history. Mentio…
change bounds to nodes and qualify their use
@twhiteaker
Copy link
Contributor

Dave submitted an update on 2018-07-20 in which we use "nodes" instead of "bounds" to relate coordinate variables to corresponding node variables. By the three week rule, the date for acceptance into the conventions will be 2018-08-10. Any more discussion? Any chance we can have the procedure worked out by then for actually accepting the pull request?

@davidhassell
Copy link
Contributor

Thanks Dave and Tim - no more discussion from me, apart from the update I just suggested to the conformance document to reflect the new changes. I'll summarise (briefly!) the recent changes on the Trac ticket. Once re-accepted, there would be no problem in using geometries in production, even though the github mechanics may not be sorted out by that time.

All the best, David

@dblodgett-usgs
Copy link
Contributor Author

Dear All,

Now that ticket the github contributing rules have been merged in #137, I think this PR can be merged, right?

  • Dave

@davidhassell
Copy link
Contributor

Right. It will be done.

@dblodgett-usgs
Copy link
Contributor Author

@davidhassell -- I'm a little hesitant to merge my own PR. Do you mind pushing the button?

@ChrisBarker-NOAA
Copy link
Contributor

I agree that it is good practice not to merge one’s own PR. But if it’s a really trivial typo, I say go for it.

Remember that the entire point if got us that you preserve the history and can roll things back if anyone makes a mistake.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.