Geometries Contribution (and github experimentation) #115
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -54,5 +54,4 @@ respect to the specified dimension. | |
| `variance` | __u^2^__ | Variance | ||
|=============== | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
| ||
== Data Representative of Cells | ||
|
||
When gridded data does not represent the point values of a field but instead represents some characteristic of the field within cells of finite "volume," a complete description of the variable should include metadata that describes the domain or extent of each cell, and the characteristic of the field that the cell values represent. It is possible for a single data value to be the result of an operation whose domain is a disjoint set of cells. This is true for many types of climatological averages, for example, the mean January temperature for the years 1970-2000. The methods that we present below for describing cells only provides an association of a grid point with a single cell, not with a collection of cells. However, climatological statistics are of such importance that we provide special methods for describing their associated computational domains in <<climatological-statistics>>. | ||
When gridded data does not represent the point values of a field but instead represents some characteristic of the field within cells of finite "volume," a complete description of the variable should include metadata that describes the domain or extent of each cell, and the characteristic of the field that the cell values represent. It is possible for a single data value to be the result of an operation whose domain is a disjoint set of cells. This is true for many types of climatological averages, for example, the mean January temperature for the years 1970-2000. The methods that we present below for describing cells only provide an association of a grid point with a single cell, not with a collection of cells. However, climatological statistics are of such importance that we provide special methods for describing their associated computational domains in <<climatological-statistics>>. For cases where data pertain to geospatial features with highly variable geometry node counts, such as river lines or watershed boundaries, we provide <<geometries>> as an alternative to bounds. | ||
|
||
|
||
|
||
|
@@ -584,3 +584,194 @@ data: // time coordinates translated to date/time format | |
"2000-8-1 6:00:00", "2000-9-1 6:00:00" ; | ||
---- | ||
==== | ||
|
||
[[geometries, Section 7.5, "Geometries"]] | ||
=== Geometries | ||
|
||
For many geospatial applications, data values are associated with a geometry, which is a spatial representation of a real-world feature, for instance a time-series of areal average precipitation over a watershed. | ||
Polygonal cells with an arbitrary number of vertices can be described using <<cell-boundaries>>, but in that case every cell must have the same number of vertices and must be a single polygon ring. | ||
In contrast, each geometry may have a different number of nodes, the geometries may be lines (as alternatives to points and polygons), and they may be __multipart__, i.e., include several disjoint parts. | ||
While line and point geometries don't describe an interval along a dimension as the traditional cell bounds described above do, they do describe the extent of a geometry or real-world feature so are included in this section. | ||
@JonathanGregory and @davidhassell - How's this? Not sure what words to use to be consistent with the rest of CF, but I think this is what we were getting at?

@dblodgett-usgs I think this is looking good! Thanks. Some minor comments from my latest read

All the best, David

@davidhassell Example 7.15 was meant in part to demonstrate that geometry could be encoded without associated data variables. For example, it shows that you would need a grid_mapping attribute on the geometry container variable if there isn't a data variable providing that attribute. Do you think there's merit in showing an example of geometry without a data variable?

@twhiteaker I see, thanks. However, storing coordinates (a "domain") without a data variable is not currently allowed by CF, and I think that to introduce it would need a ticket and discussion in its own right - for example, how would you encode the "usual" regular lat-lon grid without a data variable to bind it all together? This is similar in principle to insisting that there are also representative coordinates for each cell --- bounds without coordinates are also not currently possible. All the best, David

@davidhassell Seems like it's better to get this proposal accepted and take broader steps like standalone geometries later then. I'll get a revised example 7.15 into the pull request ASAP.
||
The approach described here specifies how to encode such geometries following the pattern in **9.3.3 Contiguous ragged array representation** and attach them to variables in a way that is consistent with the cell bounds approach. | ||
|
||
All geometries are made up of one or more nodes. | ||
The geometry type specifies the set of topological assumptions to be applied to relate the nodes (see Table 7.1). | ||
For example, multipoint and line geometries are nearly the same except nodes are interpreted as being connected for lines. | ||
Lines and polygons are also nearly the same except that the first and last nodes are assumed to be connected for polygons. | ||
Note that CF does not require the first and last node to be identical but allows them to be coincident if desired. | ||
Polygons that have holes, such as waterbodies in a land unit, are encoded as a collection of polygon ring parts, each identified as __exterior__ or __interior__ polygons. | ||
Multipart geometries, such as multiple lines representing the same river or multiple islands representing the same jurisdiction, are encoded as collections of unconnected points, lines, or polygons that are logically grouped into a single geometry. | ||
|
||
Any data variable can be given a **`geometry`** attribute that indicates the geometry for the quantity held in the variable. | ||
One of the dimensions of the data variable must be the number of geometries to which the data applies. | ||
As shown in Example 7.15, if the data variable has a discrete sampling geometry, the number of geometries is the length of the instance dimension (Section 9.2). | ||
|
||
[["timeseries-with-geometry"]] | ||
[caption="Example 7.15. "] | ||
.Timeseries with geometry. | ||
==== | ||
---- | ||
dimensions: | ||
instance = 2 ; | ||
node = 5 ; | ||
time = 4 ; | ||
variables: | ||
int time(time) ; | ||
time:units = "days since 2000-01-01" ; | ||
double lat(instance) ; | ||
lat:units = "degrees_north" ; | ||
lat:standard_name = "latitude" ; | ||
double lon(instance) ; | ||
lon:units = "degrees_east" ; | ||
lon:standard_name = "longitude" ; | ||
int datum ; | ||
datum:grid_mapping_name = "latitude_longitude" ; | ||
datum:longitude_of_prime_meridian = 0.0 ; | ||
datum:semi_major_axis = 6378137.0 ; | ||
datum:inverse_flattening = 298.257223563 ; | ||
int geometry_container ; | ||
geometry_container:geometry_type = "line" ; | ||
geometry_container:node_count = "node_count" ; | ||
geometry_container:node_coordinates = "x y" ; | ||
int node_count(instance) ; | ||
double x(node) ; | ||
x:units = "degrees_east" ; | ||
x:standard_name = "longitude" ; | ||
x:axis = "X" ; | ||
double y(node) ; | ||
y:units = "degrees_north" ; | ||
y:standard_name = "latitude" ; | ||
y:axis = "Y" ; | ||
double someData(instance, time) ; | ||
someData:coordinates = "time lat lon" ; | ||
someData:grid_mapping = "datum" ; | ||
someData:geometry = "geometry_container" ; | ||
// global attributes: | ||
:Conventions = "CF-1.8" ; | ||
:featureType = "timeSeries" ; | ||
data: | ||
time = 1, 2, 3, 4 ; | ||
lat = 10, 60 ; | ||
lon = 30, 50 ; | ||
someData = | ||
1, 2, 3, 4, | ||
1, 2, 3, 4 ; | ||
node_count = 3, 2 ; | ||
x = 30, 10, 40, 50, 50 ; | ||
y = 10, 30, 40, 60, 50 ; | ||
---- | ||
The time series variable, someData, is associated with line geometries via the geometry attribute. The first line geometry comprises three nodes, while the second has two nodes. Client applications unaware of CF geometries can fall back to the lat and lon variables to locate feature instances in space. In this example, lat and lon coordinates are identical to the first node in each line geometry, though any representative point could be used. | ||
==== | ||
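
As an illustration of how a reader might unpack the contiguous ragged arrays in Example 7.15, here is a minimal Python sketch. It is not part of the convention; the helper name `split_by_count` is hypothetical, and the arrays are copied by hand from the example rather than read from a netCDF file.

```python
from itertools import accumulate


def split_by_count(values, counts):
    """Slice a flat sequence into consecutive chunks of the given lengths."""
    if sum(counts) != len(values):
        raise ValueError("counts must sum to the number of values")
    ends = list(accumulate(counts))
    starts = [0] + ends[:-1]
    return [list(values[s:e]) for s, e in zip(starts, ends)]


# Values copied from Example 7.15.
node_count = [3, 2]
x = [30, 10, 40, 50, 50]
y = [10, 30, 40, 60, 50]

# One list of (x, y) nodes per line geometry.
lines = [
    list(zip(xs, ys))
    for xs, ys in zip(split_by_count(x, node_count), split_by_count(y, node_count))
]
print(lines)
```

Each entry of `lines` holds the nodes of one instance, in the order of the instance dimension.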
|
||
|
||
A __geometry container__ variable acts as a container for attributes that describe a set of geometries. | ||
The **`geometry`** attribute of the data variable contains the name of a geometry container variable. | ||
The geometry container variable must hold **`geometry_type`** and **`node_coordinates`** attributes. | ||
The **`grid_mapping`** and **`coordinates`** attributes can be carried by the geometry container variable provided they are also carried by the data variables associated with the container. | ||
|
||
The **`geometry_type`** attribute indicates the type of geometry present. | ||
Its allowable values are: __point__, __line__, __polygon__. | ||
Multipart geometries are allowed for all three geometry types. | ||
For example, polygon geometries could include single part geometries like the State of Colorado and multipart geometries like the State of Hawaii. | ||
|
||
The **`node_coordinates`** attribute contains the blank-separated names of the variables that contain geometry node coordinates (one variable for each spatial dimension). | ||
The geometry node coordinate variables must each have an **`axis`** attribute whose allowable values are __X__, __Y__, and __Z__. | ||
If a **`coordinates`** attribute is carried by the geometry container variable or its parent data variable, then those coordinate variables which correspond to node coordinate variables must have a **`bounds`** attribute that names the corresponding node coordinate. | ||
The geometry node coordinate variables must all have the same single dimension, which is the total number of nodes in all the geometries. | ||
The nodes must be stored consecutively for each geometry and in the order of the geometries, and within each multipart geometry the nodes must be stored consecutively for each part and in the order of the parts. | ||
Polygon exterior rings must be put in anticlockwise order (viewed from above) and polygon interior rings in clockwise order. | ||
They are put in opposite orders to facilitate calculation of area and consistency with the typical implementation pattern. | ||
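
The ring ordering rule can be checked with a signed-area calculation. The following Python sketch uses the planar shoelace formula and treats the node coordinates as Cartesian, which is an assumption made for illustration only and ignores the curvature of the reference surface. The function names are hypothetical; the sample rings are the first two parts of Example 7.16.

```python
def signed_area(ring):
    """Planar shoelace formula: positive for anticlockwise rings, negative for clockwise.

    `ring` is a list of (x, y) tuples; the last node is implicitly joined to the first.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        total += x0 * y1 - x1 * y0
    return total / 2.0


def is_anticlockwise(ring):
    return signed_area(ring) > 0


# First two parts of Example 7.16: an exterior ring and the hole inside it.
exterior = [(20, 0), (10, 15), (0, 0)]
interior = [(5, 5), (10, 10), (15, 5)]

print(signed_area(exterior), is_anticlockwise(exterior))
print(signed_area(interior), is_anticlockwise(interior))
```

Because exterior and interior rings carry opposite signs, summing the signed areas of all parts of a polygon yields its net planar area with holes subtracted.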
|
||
When more than one geometry instance is present, the geometry container variable must have a **`node_count`** attribute that contains the name of a variable indicating the count of nodes per geometry. | ||
The node count for each geometry is the total number of nodes in all of its parts. | ||
The exception is when all geometries are single part point geometries, in which case a node count is not needed since each geometry contains a single node. | ||
|
||
For multipart __lines__, multipart __polygons__, and __polygons__ with holes, the geometry container variable must have a **`part_node_count`** attribute that contains the name of a variable holding the count of nodes per geometry part. | ||
Note that because multipoint geometries always have a single node per part, the **`part_node_count`** is not required for __point__ geometry types. | ||
The single dimension of the part node count variable must equal the total number of parts in all the geometries. | ||
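
The relationship between the node count and part node count variables can be sketched as follows. This Python function is a hypothetical helper, not an API of any library: it walks the part counts and assigns each part to a geometry until that geometry's node count is used up, raising an error if the two variables disagree.

```python
def group_parts(node_count, part_node_count):
    """Return, for each geometry, the (start, stop) node index range of each of its parts."""
    geometries = []
    part_iter = iter(part_node_count)
    start = 0
    for total in node_count:
        parts = []
        remaining = total
        while remaining > 0:
            try:
                n = next(part_iter)
            except StopIteration:
                raise ValueError("part_node_count ends before node_count is satisfied") from None
            if n <= 0 or n > remaining:
                raise ValueError("part_node_count is inconsistent with node_count")
            parts.append((start, start + n))
            start += n
            remaining -= n
        geometries.append(parts)
    if next(part_iter, None) is not None:
        raise ValueError("part_node_count has leftover parts")
    return geometries


# Counts copied from Example 7.16: two geometries, four parts, twelve nodes.
print(group_parts([9, 3], [3, 3, 3, 3]))
```

The index ranges can then be used to slice each node coordinate variable, since all of them share the same node dimension.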
|
||
For __polygon__ geometries with holes, the geometry container variable must have an **`interior_ring`** attribute that contains the name of a variable that indicates if the polygon parts are interior rings (i.e., holes) or not. | ||
This interior ring variable must contain the value 0 to indicate an exterior ring polygon and 1 to indicate an interior ring polygon. | ||
The single dimension of the interior ring variable must be the same dimension as that of the part node count variable. | ||
The geometry types included in this convention are listed in Table 7.1. | ||
|
||
[cols="4"] | ||
|=============== | ||
| geometry_type | Dimensionality | Description of Geometry Instance | Additional required attributes on geometry container variable | ||
|
||
| **point** | 0 | A collection of one or more points, where a point is a single location in space | node_count (if multipart geometries are present) | ||
|
||
| **line** | 1 | A collection of one or more lines, where a line is an ordered set of data points connected by linearly interpolating between points | node_count, part_node_count (if multipart geometries are present) | ||
|
||
| **polygon** | 2 | A collection of one or more polygons, where a polygon is a planar surface comprised of an exterior ring and zero or more interior rings (i.e., holes), where a ring is a closed line (i.e., the last point in the line is assumed to be connected to the first point) | node_count, part_node_count (if holes or multipart geometries are present), interior_ring (if holes are present) | ||
|=============== | ||
|
||
**Table 7.1.** Dimensionality, description, and additional required attributes for geometry_types. | ||
|
||
[[complete-multipolygon-example]] | ||
[caption="Example 7.16. "] | ||
.Polygons with holes | ||
==== | ||
This example demonstrates all potential attributes and variables for encoding geometries. | ||
---- | ||
dimensions: | ||
node = 12 ; | ||
instance = 2 ; | ||
part = 4 ; | ||
time = 4 ; | ||
variables: | ||
int time(time) ; | ||
time:units = "days since 2000-01-01" ; | ||
double x(node) ; | ||
x:units = "degrees_east" ; | ||
x:standard_name = "longitude" ; | ||
x:axis = "X" ; | ||
double y(node) ; | ||
y:units = "degrees_north" ; | ||
y:standard_name = "latitude" ; | ||
y:axis = "Y" ; | ||
double lat(instance) ; | ||
lat:units = "degrees_north" ; | ||
lat:standard_name = "latitude" ; | ||
lat:bounds = "y" ; | ||
double lon(instance) ; | ||
lon:units = "degrees_east" ; | ||
lon:standard_name = "longitude" ; | ||
lon:bounds = "x" ; | ||
float geometry_container ; | ||
geometry_container:geometry_type = "polygon" ; | ||
geometry_container:node_count = "node_count" ; | ||
geometry_container:node_coordinates = "x y" ; | ||
geometry_container:grid_mapping = "datum" ; | ||
geometry_container:coordinates = "lat lon" ; | ||
Since no data variable, I've added coordinates to the geometry_container. Will add a little note about this where we say the grid_mapping can be put on the geometry container.
||
geometry_container:part_node_count = "part_node_count" ; | ||
geometry_container:interior_ring = "interior_ring" ; | ||
int node_count(instance) ; | ||
int part_node_count(part) ; | ||
int interior_ring(part) ; | ||
float datum ; | ||
datum:grid_mapping_name = "latitude_longitude" ; | ||
datum:semi_major_axis = 6378137. ; | ||
datum:inverse_flattening = 298.257223563 ; | ||
datum:longitude_of_prime_meridian = 0. ; | ||
double someData(instance, time) ; | ||
someData:coordinates = "time lat lon" ; | ||
someData:grid_mapping = "datum" ; | ||
someData:geometry = "geometry_container" ; | ||
// global attributes: | ||
:Conventions = "CF-1.8" ; | ||
:featureType = "timeSeries" ; | ||
data: | ||
time = 1, 2, 3, 4 ; | ||
x = 20, 10, 0, 5, 10, 15, 20, 10, 0, 50, 40, 30 ; | ||
y = 0, 15, 0, 5, 10, 5, 20, 35, 20, 0, 15, 0 ; | ||
lat = 25, 7 ; | ||
lon = 10, 40 ; | ||
node_count = 9, 3 ; | ||
part_node_count = 3, 3, 3, 3 ; | ||
interior_ring = 0, 1, 0, 0 ; | ||
someData = | ||
1, 2, 3, 4, | ||
1, 2, 3, 4 ; | ||
---- | ||
==== |
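
To show how the variables in Example 7.16 fit together, the following Python sketch decodes them into per-geometry lists of rings labelled by role. It is a rough illustration under stated assumptions: the function name `decode_polygons` is hypothetical, the arrays are copied by hand from the example, and no validation of the counts is attempted.

```python
# Values copied from Example 7.16.
x = [20, 10, 0, 5, 10, 15, 20, 10, 0, 50, 40, 30]
y = [0, 15, 0, 5, 10, 5, 20, 35, 20, 0, 15, 0]
node_count = [9, 3]
part_node_count = [3, 3, 3, 3]
interior_ring = [0, 1, 0, 0]


def decode_polygons(x, y, node_count, part_node_count, interior_ring):
    """Return one list per geometry; each entry is (role, ring) with role 'exterior' or 'interior'."""
    geometries = []
    node = 0  # index of the next unread node
    part = 0  # index of the next unread part
    for total in node_count:
        rings = []
        end = node + total
        while node < end:
            n = part_node_count[part]
            role = "interior" if interior_ring[part] == 1 else "exterior"
            rings.append((role, list(zip(x[node:node + n], y[node:node + n]))))
            node += n
            part += 1
        geometries.append(rings)
    return geometries


polygons = decode_polygons(x, y, node_count, part_node_count, interior_ring)
for i, rings in enumerate(polygons):
    print(i, [(role, len(ring)) for role, ring in rings])
```

The first geometry decodes to two exterior rings and one interior ring, and the second to a single exterior ring, matching the `interior_ring` values in the example.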
@JonathanGregory and @davidhassell - we can now comment on each line individually as needed.