Use of netCDF-4 data types in CF-2 files

1. Summary

This document proposes using new data types which are available in netCDF-4 (and to a certain extent in netCDF Classic) in files which conform to the CF Conventions. It draws heavily from the CF-1 proposal "Add new integer types to CF" and attempts to incorporate all relevant material and comments. Less effort is made to support compatibility with netCDF-3, as CF-2 makes heavy use of features only available in netCDF-4.

The netCDF-4 standard allows the use of several new data types which were not available when the CF Conventions were written.

Atomic external types
  • NC_UBYTE 8-bit unsigned integer

  • NC_USHORT 16-bit unsigned integer

  • NC_UINT 32-bit unsigned integer

  • NC_INT64 64-bit signed integer

  • NC_UINT64 64-bit unsigned integer

  • NC_STRING variable length character string

User-defined types
  • Compound types

  • VLEN types

  • Opaque types

  • Enum types

These types have the potential to allow data producers to encode data with greater accuracy, succinctness, and expressivity.

Despite these advantages, allowing too many encoding options creates the risk that data producers will write products which users cannot easily read. The aim of this proposal is therefore to describe the use of some simple data types which can be interpreted in a straightforward fashion, rather than to open up the full expressive breadth of netCDF-4 within the CF Conventions. It is intentionally narrow in scope: it allows data types which past discussions have shown to be fairly uncontroversial, and it establishes guidelines for newer data types which data producers are likely to use in the foreseeable future. At this time, those types are the atomic external and enum types. This proposal is not intended to preclude future discussions about the more complex user-defined types (compound, VLEN, and opaque).

2. Using new data types in CF-2

Data which makes use of this proposal shall adhere to the following principles:

  • All netCDF atomic external types (char, byte, unsigned byte, short, unsigned short, int, unsigned int, int64, unsigned int64, float or real, double, and string) are acceptable.

  • enum types may be used.

  • Other user-defined types (compound, VLEN and opaque types) are out of the scope of this proposal.
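
The three rules above can be sketched as a simple lookup. This is an illustrative sketch only, not part of the proposal: the function and set names are hypothetical, and the type names are assumed to be spelled as in CDL (for example `ubyte` for unsigned byte, with `real` as a synonym of `float`).

```python
# Atomic external types accepted by the proposal.
# Assumption: CDL spellings are used for the type names.
ATOMIC_TYPES = {
    "char", "byte", "ubyte", "short", "ushort", "int", "uint",
    "int64", "uint64", "float", "real", "double", "string",
}

# User-defined type classes that the proposal leaves out of scope.
OUT_OF_SCOPE_USER_TYPES = {"compound", "vlen", "opaque"}


def classify_type(kind: str) -> str:
    """Return 'atomic', 'enum', 'out-of-scope', or 'unknown' for a type class name."""
    name = kind.strip().lower()
    if name in ATOMIC_TYPES:
        return "atomic"
    if name == "enum":
        return "enum"
    if name in OUT_OF_SCOPE_USER_TYPES:
        return "out-of-scope"
    return "unknown"


def in_scope(kind: str) -> bool:
    """True if the type class is covered by this proposal (atomic or enum)."""
    return classify_type(kind) in {"atomic", "enum"}
```

A checker built this way would report compound, VLEN, and opaque types as out of scope instead of rejecting them outright, which matches the proposal's wording that these types are simply not addressed here.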

Files indicate adherence to this proposal with a Conventions attribute which includes the string "CF2-Types-X", where X is the release of this standard to which the file adheres.
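
A reader might detect this marker as sketched below. The proposal does not fix the format of X or how multiple convention names are separated, so two assumptions are made here: X is a dotted numeric release such as `1` or `1.0`, and convention names are separated by commas and/or whitespace. The function name is hypothetical.

```python
import re

# Assumption: X is a dotted numeric release (e.g. "1" or "1.0").
# The proposal itself does not specify the format of X.
_CF2_TYPES_TOKEN = re.compile(r"^CF2-Types-(\d+(?:\.\d+)*)$")


def cf2_types_release(conventions: str):
    """Return the release X from a "CF2-Types-X" entry, or None if absent."""
    # Assumption: convention names are separated by commas and/or whitespace.
    for token in re.split(r"[,\s]+", conventions.strip()):
        match = _CF2_TYPES_TOKEN.match(token)
        if match:
            return match.group(1)
    return None
```

Matching whole tokens is stricter than the "includes the string" wording above; a plain substring test would also satisfy the text but could match unintended names.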

3. Good practices

The following good practices are encouraged:

  • Because unsigned types are now available, using signed types (together with the attributes valid_min and valid_max, or valid_range) to describe data which is by nature unsigned is discouraged. Instead, data should be represented in the type which best matches the "real data".

  • Likewise, the use of character arrays to express string-like data is discouraged; instead, the type NC_STRING should be used.

  • One-byte numeric data should be stored using the byte or unsigned byte data types.

  • The char type is not intended for numeric data.
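
As an illustration of the first practice, the sketch below picks an integer type from the range of values to be stored. It assumes that "best matches" means the narrowest type whose range contains the data, with the unsigned type preferred when the data is non-negative; the proposal does not state this rule, and the sketch ignores other considerations such as reserving a fill value. The function name is hypothetical.

```python
# Value ranges of the netCDF integer types, narrowest first.
# At each width the unsigned type is listed first, so non-negative
# data is matched to an unsigned type before a signed one.
INTEGER_TYPES = [
    ("NC_UBYTE", 0, 2**8 - 1),
    ("NC_BYTE", -(2**7), 2**7 - 1),
    ("NC_USHORT", 0, 2**16 - 1),
    ("NC_SHORT", -(2**15), 2**15 - 1),
    ("NC_UINT", 0, 2**32 - 1),
    ("NC_INT", -(2**31), 2**31 - 1),
    ("NC_UINT64", 0, 2**64 - 1),
    ("NC_INT64", -(2**63), 2**63 - 1),
]


def narrowest_integer_type(lowest: int, highest: int) -> str:
    """Return the first listed type whose range contains [lowest, highest]."""
    if lowest > highest:
        raise ValueError("lowest must not exceed highest")
    for name, type_min, type_max in INTEGER_TYPES:
        if type_min <= lowest and highest <= type_max:
            return name
    raise ValueError("no netCDF integer type covers this range")
```

Under these assumptions, counts in the range 0 to 200 map to NC_UBYTE without any need for valid_min or valid_max, while a range of -5 to 100 maps to NC_BYTE.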