Appendix B. File Format Specifications {#file_format_specifications}
=======================================

\section classic_format_spec The NetCDF Classic Format Specification
To present the format more formally, we use a BNF grammar notation. In
this notation:
- Non-terminals (entities defined by grammar rules) are in lower case.
- Terminals (atomic entities in terms of which the format
specification is written) are in upper case, and are specified
literally as US-ASCII characters within single-quote characters or are
described with text between angle brackets (‘<’ and ‘>’).
- Optional entities are enclosed between brackets (‘[’ and ‘]’).
- A sequence of zero or more occurrences of an entity is denoted by
‘[entity ...]’.
- A vertical line character (‘|’) separates alternatives. Alternation
has lower precedence than concatenation.
- Comments follow ‘//’ characters.
- A single byte that is not a printable character is denoted using a
hexadecimal number with the notation ‘\\xDD’, where each D is a
hexadecimal digit.
- A literal single-quote character is denoted by ‘\'’, and a literal
back-slash character is denoted by ‘\\’.
27 Following the grammar, a few additional notes are included to specify
28 format characteristics that are impractical to capture in a BNF
29 grammar, and to note some special cases for implementers. Comments in
30 the grammar point to the notes and special cases, and help to clarify
31 the intent of elements of the format.
33 <h1>The Format in Detail</h1>
36 netcdf_file = header data
37 header = magic numrecs dim_list gatt_list var_list
38 magic = 'C' 'D' 'F' VERSION
39 VERSION = \\x01 | // classic format
40 \\x02 // 64-bit offset format
41 numrecs = NON_NEG | STREAMING // length of record dimension
42 dim_list = ABSENT | NC_DIMENSION nelems [dim ...]
43 gatt_list = att_list // global attributes
44 att_list = ABSENT | NC_ATTRIBUTE nelems [attr ...]
45 var_list = ABSENT | NC_VARIABLE nelems [var ...]
46 ABSENT = ZERO ZERO // Means list is not present
47 ZERO = \\x00 \\x00 \\x00 \\x00 // 32-bit zero
48 NC_DIMENSION = \\x00 \\x00 \\x00 \\x0A // tag for list of dimensions
49 NC_VARIABLE = \\x00 \\x00 \\x00 \\x0B // tag for list of variables
50 NC_ATTRIBUTE = \\x00 \\x00 \\x00 \\x0C // tag for list of attributes
51 nelems = NON_NEG // number of elements in following sequence
53 name = nelems namestring
54 // Names a dimension, variable, or attribute.
55 // Names should match the regular expression
56 // ([a-zA-Z0-9_]|{MUTF8})([^\\x00-\\x1F/\\x7F-\\xFF]|{MUTF8})*
57 // For other constraints, see "Note on names", below.
58 namestring = ID1 [IDN ...] padding
59 ID1 = alphanumeric | '_'
60 IDN = alphanumeric | special1 | special2
61 alphanumeric = lowercase | uppercase | numeric | MUTF8
62 lowercase = 'a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|
63 'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'
64 uppercase = 'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|
65 'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'
66 numeric = '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
67 // special1 chars have traditionally been
68 // permitted in netCDF names.
69 special1 = '_'|'.'|'@'|'+'|'-'
70 // special2 chars are recently permitted in
71 // names (and require escaping in CDL).
72 // Note: '/' is not permitted.
73 special2 = ' ' | '!' | '"' | '#' | '$' | '%' | '&' | '\'' |
74 '(' | ')' | '*' | ',' | ':' | ';' | '<' | '=' |
75 '>' | '?' | '[' | '\\' | ']' | '^' | '`' | '{' |
77 MUTF8 = <multibyte UTF-8 encoded, NFC-normalized Unicode character>
78 dim_length = NON_NEG // If zero, this is the record dimension.
79 // There can be at most one record dimension.
80 attr = name nc_type nelems [values ...]
87 var = name nelems [dimid ...] vatt_list nc_type vsize begin
88 // nelems is the dimensionality (rank) of the
89 // variable: 0 for scalar, 1 for vector, 2
91 dimid = NON_NEG // Dimension ID (index into dim_list) for
92 // variable shape. We say this is a "record
93 // variable" if and only if the first
94 // dimension is the record dimension.
95 vatt_list = att_list // Variable-specific attributes
vsize = NON_NEG // Variable size. If not a record variable,
// the amount of space in bytes allocated to
// the variable's data. If a record variable,
// the amount of space per record. See "Note
// on vsize", below.
101 begin = OFFSET // Variable start location. The offset in
102 // bytes (seek index) in the file of the
103 // beginning of data for this variable.
105 non_recs = [vardata ...] // The data for all non-record variables,
106 // stored contiguously for each variable, in
107 // the same order the variables occur in the
109 vardata = [values ...] // All data for a non-record variable, as a
110 // block of values of the same type as the
111 // variable, in row-major order (last
112 // dimension varying fastest).
113 recs = [record ...] // The data for all record variables are
114 // stored interleaved at the end of the
116 record = [varslab ...] // Each record consists of the n-th slab
117 // from each record variable, for example
118 // x[n,...], y[n,...], z[n,...] where the
119 // first index is the record number, which
120 // is the unlimited dimension index.
121 varslab = [values ...] // One record of data for a variable, a
122 // block of values all of the same type as
123 // the variable in row-major order (last
124 // index varying fastest).
125 values = bytes | chars | shorts | ints | floats | doubles
126 string = nelems [chars]
127 bytes = [BYTE ...] padding
128 chars = [CHAR ...] padding
129 shorts = [SHORT ...] padding
132 doubles = [DOUBLE ...]
padding = <0, 1, 2, or 3 bytes to next 4-byte boundary>
// Header padding uses null (\\x00) bytes. In
// data, padding uses variable's fill value.
// See "Note on padding", below, for a special
// case.
138 NON_NEG = <non-negative INT>
139 STREAMING = \\xFF \\xFF \\xFF \\xFF // Indicates indeterminate record
140 // count, allows streaming data
141 OFFSET = <non-negative INT> | // For classic format or
142 <non-negative INT64> // for 64-bit offset format
143 BYTE = <8-bit byte> // See "Note on byte data", below.
144 CHAR = <8-bit byte> // See "Note on char data", below.
145 SHORT = <16-bit signed integer, Bigendian, two's complement>
146 INT = <32-bit signed integer, Bigendian, two's complement>
147 INT64 = <64-bit signed integer, Bigendian, two's complement>
148 FLOAT = <32-bit IEEE single-precision float, Bigendian>
149 DOUBLE = <64-bit IEEE double-precision float, Bigendian>
150 // following type tags are 32-bit integers
151 NC_BYTE = \\x00 \\x00 \\x00 \\x01 // 8-bit signed integers
152 NC_CHAR = \\x00 \\x00 \\x00 \\x02 // text characters
153 NC_SHORT = \\x00 \\x00 \\x00 \\x03 // 16-bit signed integers
154 NC_INT = \\x00 \\x00 \\x00 \\x04 // 32-bit signed integers
155 NC_FLOAT = \\x00 \\x00 \\x00 \\x05 // IEEE single precision floats
156 NC_DOUBLE = \\x00 \\x00 \\x00 \\x06 // IEEE double precision floats
// Default fill values for each type, may be
// overridden by variable attribute named
// '_FillValue'. See "Note on fill values",
// below.
161 FILL_CHAR = \\x00 // null byte
162 FILL_BYTE = \\x81 // (signed char) -127
163 FILL_SHORT = \\x80 \\x01 // (short) -32767
164 FILL_INT = \\x80 \\x00 \\x00 \\x01 // (int) -2147483647
165 FILL_FLOAT = \\x7C \\xF0 \\x00 \\x00 // (float) 9.9692099683868690e+36
166 FILL_DOUBLE = \\x47 \\x9E \\x00 \\x00 \\x00 \\x00 \\x00 \\x00 //(double)9.9692099683868690e+36
169 Note on vsize: This number is the product of the dimension lengths
170 (omitting the record dimension) and the number of bytes per value
171 (determined from the type), increased to the next multiple of 4, for
172 each variable. If a record variable, this is the amount of space per
173 record (except that, for backward compatibility, it always includes
174 padding to the next multiple of 4 bytes, even in the exceptional case
175 noted below under “Note on padding”). The netCDF “record size” is
176 calculated as the sum of the vsize's of all the record variables.
The vsize field is actually redundant, because its value may be
computed from other information in the header. The 32-bit vsize field
is not large enough to contain the size of variables that require more
than 2^32 - 4 bytes, so 2^32 - 1 is used in the vsize field for such
variables.
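
The vsize rule can be sketched in C as follows. This is a minimal illustration under stated assumptions, not code from the netCDF library: the helper names `extsize_of` and `vsize_field` are hypothetical, the type tags are the numeric values from the grammar above, and overflow of the 64-bit intermediate product is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes per external value, keyed by the numeric type tags in the grammar
 * (hypothetical helper). Returns 0 for an unknown tag. */
uint64_t extsize_of(int nc_type)
{
    switch (nc_type) {
    case 1: /* NC_BYTE   */
    case 2: /* NC_CHAR   */ return 1;
    case 3: /* NC_SHORT  */ return 2;
    case 4: /* NC_INT    */
    case 5: /* NC_FLOAT  */ return 4;
    case 6: /* NC_DOUBLE */ return 8;
    default: return 0;
    }
}

/* Value to store in the 32-bit vsize field (hypothetical helper).
 * If is_record is nonzero, dimlens[0] is the record dimension and is
 * left out of the product. */
uint32_t vsize_field(int nc_type, const uint64_t *dimlens, size_t ndims,
                     int is_record)
{
    uint64_t nbytes = extsize_of(nc_type);
    assert(nbytes != 0);
    for (size_t i = is_record ? 1 : 0; i < ndims; i++)
        nbytes *= dimlens[i];
    if (nbytes > UINT32_MAX - 3)   /* more than 2^32 - 4 bytes */
        return UINT32_MAX;         /* 2^32 - 1 marks "too large" */
    return (uint32_t)((nbytes + 3) & ~(uint64_t)3); /* round up to 4 */
}
```

For a five-element NC_SHORT variable this gives 12 (10 bytes of data rounded up to the next multiple of 4).
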
Note on names: Earlier versions of the netCDF C-library reference
implementation enforced a more restricted set of characters in
creating new names, but permitted reading names containing arbitrary
bytes. This specification extends the permitted characters in names to
include multi-byte UTF-8 encoded Unicode and additional printing
characters from the US-ASCII alphabet. The first character of a name
must be alphanumeric, a multi-byte UTF-8 character, or '_' (reserved
for special names with meaning to implementations, such as the
“_FillValue” attribute). Subsequent characters may also include
printing special characters, except for '/' which is not allowed in
names. Names that have trailing space characters are also not
permitted.
197 Implementations of the netCDF classic and 64-bit offset format must
198 ensure that names are normalized according to Unicode NFC
199 normalization rules during encoding as UTF-8 for storing in the file
200 header. This is necessary to ensure that gratuitous differences in the
201 representation of Unicode names do not cause anomalies in comparing
202 files and querying data objects by name.
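
A byte-level version of the name rules might look like the sketch below. The function name `name_bytes_ok` is hypothetical. As a simplifying assumption, any byte of 0x80 or above is accepted as part of a multi-byte UTF-8 character; the sketch does not verify that such sequences are well-formed UTF-8, nor that the name is NFC-normalized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the NUL-terminated name passes the byte-level checks,
 * 0 otherwise (hypothetical helper; no UTF-8 or NFC validation). */
int name_bytes_ok(const char *name)
{
    assert(name != NULL);
    const unsigned char *s = (const unsigned char *)name;
    size_t len = strlen(name);
    if (len == 0)
        return 0;

    /* First byte: ASCII letter, digit, '_', or start of a multi-byte
     * sequence (assumed valid). */
    unsigned char c = s[0];
    int first_ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                   (c >= '0' && c <= '9') || c == '_' || c >= 0x80;
    if (!first_ok)
        return 0;

    /* Later bytes: no control characters, no DEL, no '/'. */
    for (size_t i = 1; i < len; i++) {
        c = s[i];
        if (c < 0x20 || c == 0x7F || c == '/')
            return 0;
    }

    /* Trailing space characters are not permitted. */
    return s[len - 1] != ' ';
}
```
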
204 Note on streaming data: The largest possible record count, 2^32 - 1,
205 is reserved to indicate an indeterminate number of records. This means
206 that the number of records in the file must be determined by other
207 means, such as reading them or computing the current number of records
208 from the file length and other information in the header. It also
209 means that the numrecs field in the header will not be updated as
210 records are added to the file. [This feature is not yet implemented].
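
One way to compute the record count from the file length is sketched below. The function and parameter names are hypothetical: `first_rec_begin` is assumed to be the smallest begin value among the record variables, and `recsize` the record size described under “Note on vsize”. Only complete records are counted.

```c
#include <assert.h>
#include <stdint.h>

/* Number of complete records in a file of file_len bytes
 * (hypothetical helper). */
uint64_t count_records(uint64_t file_len, uint64_t first_rec_begin,
                       uint64_t recsize)
{
    assert(recsize > 0);
    if (file_len <= first_rec_begin)
        return 0;                       /* no record data present */
    return (file_len - first_rec_begin) / recsize;
}
```
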
Note on padding: In the special case when there is only one record
variable and it is of type character, byte, or short, no padding is
used between record slabs, so records after the first record do not
necessarily start on four-byte boundaries. However, as noted above
under “Note on vsize”, the vsize field is computed to include padding
to the next multiple of 4 bytes. In this case, readers should ignore
vsize and assume no padding. Writers should store vsize as if padding
were included.
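
A reader could apply the padding rule when computing the actual size of one record, as in this sketch. The `recvar` struct and `record_size` function are hypothetical; each entry carries the bytes per value and the number of values in one record of that variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One record variable, reduced to what the size computation needs
 * (hypothetical type). */
struct recvar {
    uint64_t extsize; /* bytes per value */
    uint64_t nvals;   /* values in one record of this variable */
};

/* Actual size in bytes of one record in the file. */
uint64_t record_size(const struct recvar *vars, size_t nvars)
{
    assert(vars != NULL || nvars == 0);
    if (nvars == 1)   /* special case: a lone record variable is unpadded */
        return vars[0].extsize * vars[0].nvals;

    uint64_t total = 0;
    for (size_t i = 0; i < nvars; i++) {
        uint64_t nbytes = vars[i].extsize * vars[i].nvals;
        total += (nbytes + 3) & ~(uint64_t)3; /* pad each slab to 4 bytes */
    }
    return total;
}
```

For a lone variable of a 4-byte or 8-byte type the two branches agree, because the slab size is already a multiple of 4.
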
Note on byte data: It is possible to interpret byte data as either
signed (-128 to 127) or unsigned (0 to 255). When reading byte data
through an interface that converts it into another numeric type, the
default interpretation is signed. There are various attribute
conventions for specifying whether bytes represent signed or unsigned
data, but no standard convention has been established. The variable
attribute “_Unsigned” is reserved for this purpose in future
implementations.
Note on char data: Although the characters used in netCDF names must
be encoded as UTF-8, character data may use other encodings. The
variable attribute “_Encoding” is reserved for this purpose in future
implementations.
235 Note on fill values: Because data variables may be created before
236 their values are written, and because values need not be written
237 sequentially in a netCDF file, default “fill values” are defined for
238 each type, for initializing data values before they are explicitly
239 written. This makes it possible to detect reading values that were
240 never written. The variable attribute “_FillValue”, if present,
241 overrides the default fill value for a variable. If _FillValue is
242 defined then it should be scalar and of the same type as the variable.
244 Fill values are not required, however, because netCDF libraries have
245 traditionally supported a “no fill” mode when writing, omitting the
246 initialization of variable values with fill values. This makes the
247 creation of large files faster, but also eliminates the possibility of
248 detecting the inadvertent reading of values that haven't been written.
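
To relate the default fill bytes in the grammar to their numeric values, the following sketch decodes big-endian bytes into host values. The function names are hypothetical, and the code assumes the host uses IEEE 754 for `float` and `double`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a big-endian two's complement 16-bit integer. */
int16_t be_to_short(const uint8_t b[2])
{
    int v = (b[0] << 8) | b[1];
    return (int16_t)(v >= 0x8000 ? v - 0x10000 : v);
}

/* Decode a big-endian IEEE single-precision float. */
float be_to_float(const uint8_t b[4])
{
    uint32_t bits = 0;
    float f;
    assert(sizeof f == sizeof bits);
    for (int i = 0; i < 4; i++)
        bits = (bits << 8) | b[i];
    memcpy(&f, &bits, sizeof f);   /* reinterpret the bit pattern */
    return f;
}

/* Decode a big-endian IEEE double-precision float. */
double be_to_double(const uint8_t b[8])
{
    uint64_t bits = 0;
    double d;
    assert(sizeof d == sizeof bits);
    for (int i = 0; i < 8; i++)
        bits = (bits << 8) | b[i];
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Both floating-point fill patterns encode the same value, 1.875 × 2^122, which is why FILL_FLOAT and FILL_DOUBLE are listed with the same decimal constant.
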
\section computing_offsets Notes on Computing File Offsets

The offset (position within the file) of a specified data value in a
classic format or 64-bit offset data file is completely determined by
the variable start location (the offset in the begin field), the
external type of the variable (the nc_type field), and the dimension
indices (one for each of the variable's dimensions) of the value
desired.

The external size in bytes of one data value for each possible netCDF
type, denoted extsize below, is 1 for NC_BYTE and NC_CHAR, 2 for
NC_SHORT, 4 for NC_INT and NC_FLOAT, and 8 for NC_DOUBLE.
The record size, denoted by recsize below, is the sum of the vsize
fields of record variables (variables that use the unlimited
dimension), using the actual value determined by dimension sizes and
variable type in case the vsize field is too small for the variable
size.
To compute the offset of a value relative to the beginning of a
variable, it is helpful to precompute a “product vector” from the
dimension lengths. Form the products of the dimension lengths for the
variable from right to left, skipping the leftmost (record) dimension
for record variables, and storing the results as the product vector
for each variable. For example:
- Non-record variable: dimension lengths [5 3 2 7], product vector [210 42 14 7]
- Record variable: dimension lengths [0 2 9 4], product vector [0 72 36 4]
(the leading 0 stands for the skipped record dimension)
At this point, the leftmost product (for a record variable, the first
product after the skipped record dimension), multiplied by extsize and
rounded up to the next multiple of 4, is the variable size, vsize, in
the grammar above. For example, if the non-record variable above holds
one-byte values (extsize 1), the value of the vsize field is 212 (210
rounded up to a multiple of 4). For the record variable, again with
extsize 1, the value of vsize is just 72, since this is already a
multiple of four.
Let coord be the array of coordinates (dimension indices, zero-based)
of the desired data value. The position of the value within the
variable, counted in values, is the sum of each coordinate multiplied
by the product of the lengths of all dimensions to its right, that is,
by the next entry to the right in the product vector (and by 1 for the
last dimension). The offset of the value from the beginning of the
file is then the file offset of the first data value of the desired
variable (its begin field) plus this position times the size, in
bytes, of each datum for the variable (extsize). Finally, if the
variable is a record variable, the record coordinate 'coord[0]' is
left out of that sum; instead, the product of the record number,
'coord[0]', and the record size, recsize, is added to yield the final
offset value.
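
The offset computation can be sketched as below. The function `value_offset` and its parameters are hypothetical names. Rather than storing the product vector, the loop walks the dimensions from right to left and keeps a running stride, so each coordinate is multiplied by the product of the dimension lengths to its right. The record dimension, when present, is handled through recsize only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset from the start of the file of the value at coord[]
 * (hypothetical helper). If is_record is nonzero, dimension 0 is the
 * record dimension and coord[0] is the record number. */
uint64_t value_offset(uint64_t begin, uint64_t extsize, uint64_t recsize,
                      const uint64_t *dimlens, const uint64_t *coord,
                      size_t ndims, int is_record)
{
    size_t first = is_record ? 1 : 0; /* leftmost non-record dimension */
    uint64_t index = 0;               /* position within the slab, in values */
    uint64_t stride = 1;              /* values per step along dimension i */

    /* Visit dimensions ndims-1 down to first (last varies fastest). */
    for (size_t i = ndims; i-- > first; ) {
        assert(coord[i] < dimlens[i]);
        index += coord[i] * stride;
        stride *= dimlens[i];
    }

    uint64_t offset = begin + index * extsize;
    if (is_record && ndims > 0)
        offset += coord[0] * recsize; /* skip whole records */
    return offset;
}
```

With dimension lengths [5 3 2 7], coordinates (1, 2, 1, 3), extsize 1, and a begin value of 100, the position is 1·42 + 2·14 + 1·7 + 3 = 80, so the offset is 180.
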
309 A special case: Where there is exactly one record variable, we drop
310 the requirement that each record be four-byte aligned, so in this case
311 there is no record padding.
\subsection offset_examples Examples
315 By using the grammar above, we can derive the smallest valid netCDF
316 file, having no dimensions, no variables, no attributes, and hence, no
317 data. A CDL representation of the empty netCDF file is
323 This empty netCDF file has 32 bytes. It begins with the four-byte
324 “magic number” that identifies it as a netCDF version 1 file: ‘C’,
325 ‘D’, ‘F’, ‘\\x01’. Following are seven 32-bit integer zeros
326 representing the number of records, an empty list of dimensions, an
327 empty list of global attributes, and an empty list of variables.
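
The layout of this smallest file can be reproduced with a few lines of C. The function name `write_empty_classic` is hypothetical; it fills a caller-supplied buffer rather than writing to disk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill buf with the 32-byte empty classic-format file and return its
 * length, or 0 if buf is too small (hypothetical helper). */
size_t write_empty_classic(uint8_t *buf, size_t buflen)
{
    static const uint8_t magic[4] = { 'C', 'D', 'F', 0x01 };
    const size_t total = 32;  /* 4-byte magic + seven 32-bit zeros */

    assert(buf != NULL);
    if (buflen < total)
        return 0;
    memset(buf, 0, total);            /* numrecs and three ABSENT lists */
    memcpy(buf, magic, sizeof magic); /* 'C' 'D' 'F' \x01 */
    return total;
}
```

Writing the resulting 32 bytes to a file with `fwrite` should produce a file with the layout described above.
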
329 Below is an (edited) dump of the file produced using the Unix command
Each 16-byte portion of the file is displayed with 4 lines. The first
line displays the bytes in hexadecimal. The second line displays the
bytes as characters. The third line displays each group of two bytes
interpreted as a signed 16-bit integer. The fourth line (added by
hand) presents the interpretation of the bytes in terms of netCDF
components and values.
4344 4601 0000 0000 0000 0000 0000 0000
C D F 001 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
17220 17921 00000 00000 00000 00000 00000 00000
[magic number ] [ 0 records ] [ 0 dimensions (ABSENT) ]

0000 0000 0000 0000 0000 0000 0000 0000
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
00000 00000 00000 00000 00000 00000 00000 00000
[ 0 global atts (ABSENT) ] [ 0 variables (ABSENT) ]
354 As a less trivial example, consider the CDL
367 which corresponds to a 92-byte netCDF file. The following is an edited dump of this file:
4344 4601 0000 0000 0000 000a 0000 0001
C D F 001 \0 \0 \0 \0 \0 \0 \0 \n \0 \0 \0 001
17220 17921 00000 00000 00000 00010 00000 00001
[magic number ] [ 0 records ] [NC_DIMENSION ] [ 1 dimension ]

0000 0003 6469 6d00 0000 0005 0000 0000
\0 \0 \0 003 d i m \0 \0 \0 \0 005 \0 \0 \0 \0
00000 00003 25705 27904 00000 00005 00000 00000
[ 3 char name = "dim" ] [ size = 5 ] [ 0 global atts

0000 0000 0000 000b 0000 0001 0000 0002
\0 \0 \0 \0 \0 \0 \0 013 \0 \0 \0 001 \0 \0 \0 002
00000 00000 00000 00011 00000 00001 00000 00002
(ABSENT) ] [NC_VARIABLE ] [ 1 variable ] [ 2 char name =

7678 0000 0000 0001 0000 0000 0000 0000
v x \0 \0 \0 \0 \0 001 \0 \0 \0 \0 \0 \0 \0 \0
30328 00000 00000 00001 00000 00000 00000 00000
"vx" ] [1 dimension ] [ with ID 0 ] [ 0 attributes

0000 0000 0000 0003 0000 000c 0000 0050
\0 \0 \0 \0 \0 \0 \0 003 \0 \0 \0 \f \0 \0 \0 P
00000 00000 00000 00003 00000 00012 00000 00080
(ABSENT) ] [type NC_SHORT] [size 12 bytes] [offset: 80]

0003 0001 0004 0001 0005 8001
\0 003 \0 001 \0 004 \0 001 \0 005 200 001
00003 00001 00004 00001 00005 -32767
[ 3] [ 1] [ 4] [ 1] [ 5] [fill ]
\section offset_format_spec The 64-bit Offset Format

The netCDF 64-bit offset format differs from the classic format only
in the VERSION byte, ‘\\x02’ instead of ‘\\x01’, and the OFFSET entity,
a 64-bit instead of a 32-bit offset from the beginning of the
file. This small format change permits much larger files, but there
are still some practical size restrictions. Each fixed-size variable
and the data for one record's worth of each record variable are still
limited in size to a little less than 4 GiB. The rationale for this
limitation is to permit aggregate access to all the data in a netCDF
variable (or a record's worth of data) on 32-bit platforms.
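
Because the two formats differ only in the width of OFFSET, a header reader can branch on the VERSION byte when it reaches a begin field. The sketch below uses a hypothetical function name, `read_begin`, and assumes the caller has already located the start of the field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian begin field at p. Returns the number of bytes
 * consumed: 4 for classic (version 1), 8 for 64-bit offset (version 2),
 * or 0 for an unrecognized version (hypothetical helper). */
size_t read_begin(const uint8_t *p, uint8_t version, uint64_t *begin)
{
    assert(p != NULL && begin != NULL);
    size_t width = (version == 1) ? 4 : (version == 2) ? 8 : 0;
    uint64_t v = 0;
    for (size_t i = 0; i < width; i++)
        v = (v << 8) | p[i];          /* most significant byte first */
    *begin = v;
    return width;
}
```
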
\section netcdf_4_spec The NetCDF-4 Format
415 The netCDF-4 format implements and expands the netCDF-3 data model by
416 using an enhanced version of HDF5 as the storage layer. Use is made of
417 features that are only available in HDF5 version 1.8 and later.
419 Using HDF5 as the underlying storage layer, netCDF-4 files remove many
420 of the restrictions for classic and 64-bit offset files. The richer
421 enhanced model supports user-defined types and data structures,
422 hierarchical scoping of names using groups, additional primitive types
423 including strings, larger variable sizes, and multiple unlimited
424 dimensions. The underlying HDF5 storage layer also supports
425 per-variable compression, multidimensional tiling, and efficient
426 dynamic schema changes, so that data need not be copied when adding
427 new variables to the file schema.
429 Creating a netCDF-4/HDF5 file with netCDF-4 results in an HDF5
430 file. The features of netCDF-4 are a subset of the features of HDF5,
431 so the resulting file can be used by any existing HDF5 application.
433 Although every file in netCDF-4 format is an HDF5 file, there are HDF5
434 files that are not netCDF-4 format files, because the netCDF-4 format
435 intentionally uses a limited subset of the HDF5 data model and file
436 format features. Some HDF5 features not supported in the netCDF
437 enhanced model and netCDF-4 format include non-hierarchical group
438 structures, HDF5 reference types, multiple links to a data object,
439 user-defined atomic data types, stored property lists, more permissive
440 rules for data object names, the HDF5 date/time type, and attributes
441 associated with user-defined types.
443 A complete specification of HDF5 files is beyond the scope of this
444 document. For more information about HDF5, see the HDF5 web site:
445 http://hdf.ncsa.uiuc.edu/HDF5/.
447 The specification that follows is sufficient to allow HDF5 users to
448 create files that will be accessible from netCDF-4.
\subsection creation_order Creation Order

The netCDF API maintains the creation order of objects that are
created in the file. HDF5 does not do so by default; it maintains
objects in alphabetical order. The ability to track creation order was
added in version 1.8 of HDF5, and it must be explicitly turned on in
the HDF5 data file, in several places.
458 Each group must have link and attribute creation order set. The
459 following code (from libsrc4/nc4hdf.c) shows how the netCDF-4 library
460 sets these when creating a group.
/* Create group, with link_creation_order set in the group
 * creation property list. The error branches are assumed to use
 * the library's internal BAIL macro. */
if ((gcpl_id = H5Pcreate(H5P_GROUP_CREATE)) < 0)
   BAIL(NC_EHDFERR);
if (H5Pset_link_creation_order(gcpl_id, H5P_CRT_ORDER_TRACKED|H5P_CRT_ORDER_INDEXED) < 0)
   BAIL(NC_EHDFERR);
if (H5Pset_attr_creation_order(gcpl_id, H5P_CRT_ORDER_TRACKED|H5P_CRT_ORDER_INDEXED) < 0)
   BAIL(NC_EHDFERR);
if ((grp->hdf_grpid = H5Gcreate2(grp->parent->hdf_grpid, grp->name,
                                 H5P_DEFAULT, gcpl_id, H5P_DEFAULT)) < 0)
   BAIL(NC_EHDFERR);
if (H5Pclose(gcpl_id) < 0)
   BAIL(NC_EHDFERR);
478 Each dataset in the HDF5 file must be created with a property list for
479 which the attribute creation order has been set to creation
480 ordering. The H5Pset_attr_creation_order function is used to set the
481 creation ordering of attributes of a variable.
483 The following example code (from libsrc4/nc4hdf.c) shows how the
484 creation ordering is turned on by the netCDF library.
/* Turn on creation order tracking. The error branch is assumed to
 * use the library's internal BAIL macro. */
if (H5Pset_attr_creation_order(plistid, H5P_CRT_ORDER_TRACKED|
                               H5P_CRT_ORDER_INDEXED) < 0)
   BAIL(NC_EHDFERR);
\subsection groups_spec Groups
495 NetCDF-4 groups are the same as HDF5 groups, but groups in a netCDF-4
496 file must be strictly hierarchical. In general, HDF5 permits
497 non-hierarchical structuring of groups (for example, a group that is
498 its own grandparent). These non-hierarchical relationships are not
499 allowed in netCDF-4 files.
501 In the netCDF API, the global attribute becomes a group-level
502 attribute. That is, each group may have its own global attributes.
504 The root group of a file is named “/” in the netCDF API, where names
505 of groups are used. It should be noted that the netCDF API (like the
506 HDF5 API) makes little use of names, and refers to entities by number.
\subsection dims_spec Dimensions with HDF5 Dimension Scales
510 Until version 1.8, HDF5 did not have any capability to represent
511 shared dimensions. With the 1.8 release, HDF5 introduced the dimension
512 scale feature to allow shared dimensions in HDF5 files.
514 The dimension scale is unfortunately not exactly equivalent to the
515 netCDF shared dimension, and this leads to a number of compromises in
516 the design of netCDF-4.
A netCDF shared dimension consists solely of a length and a name. An
HDF5 dimension scale also includes values for each point along the
dimension, information that is (optionally) included in a netCDF
coordinate variable.
To handle the case of a netCDF dimension without a coordinate
variable, netCDF-4 creates dimension scales of type char, and leaves
the contents of the dimension scale empty. Only the name and length of
the scale are significant. To distinguish this case, netCDF-4 takes
advantage of the NAME attribute of the dimension scale. (Not to be
confused with the name of the scale itself.) In the case of dimensions
without coordinate data, the HDF5 dimension scale NAME attribute is
set to the string: "This is a netCDF dimension but not a netCDF
variable."
533 In the case where a coordinate variable is defined for a dimension,
534 the HDF5 dimscale matches the type of the netCDF coordinate variable,
535 and contains the coordinate data.
A further difficulty arises when an n-dimensional coordinate variable
is defined, where n is greater than one. NetCDF allows such coordinate
variables, but the HDF5 model does not allow dimension scales to be
attached to other dimension scales, making it impossible to completely
represent the multi-dimensional coordinate variables of the netCDF
model.
To capture this information, multidimensional coordinate variables
have an attribute named _Netcdf4Coordinates. The attribute is an array
of H5T_NATIVE_INT, with the netCDF dimension IDs of each of its
dimensions.
549 The _Netcdf4Coordinates attribute is otherwise hidden by the netCDF
550 API. It does not appear as one of the attributes for the netCDF
551 variable involved, except through the HDF5 API.
\subsection dim_spec2 Dimensions without HDF5 Dimension Scales

Starting with the netCDF-4.1 release, netCDF can read HDF5 files which
do not use dimension scales. In this case the netCDF library assigns
dimensions to the HDF5 dataset as needed, based on the length of the
dimension.
560 When an HDF5 file is opened, each dataset is examined in turn. The
561 lengths of all the dimensions involved in the shape of the dataset are
562 determined. Each new (i.e. previously unencountered) length results in
563 the creation of a phony dimension in the netCDF API.
565 This will not accurately detect a shared, unlimited dimension in the
566 HDF5 file, if different datasets have different lengths along this
567 dimension (possible in HDF5, but not in netCDF).
569 Note that this is a read-only capability for the netCDF library. When
570 the netCDF library writes HDF5 files, they always use a dimension
571 scale for every dimension.
573 Datasets must have either dimension scales for every dimension, or no
574 dimension scales at all. Partial dimension scales are not, at this
575 time, understood by the netCDF library.
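
The length-based assignment of phony dimensions described in this section can be sketched as a small lookup table. This illustrates the stated rule only and is not the library's implementation; the `phony_dims` type, the `MAX_PHONY` limit, and the `phony_dim_for` function are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PHONY 64  /* arbitrary table size for this sketch */

/* Lengths seen so far; the index doubles as the phony dimension ID. */
struct phony_dims {
    size_t count;
    uint64_t len[MAX_PHONY];
};

/* Return the phony dimension ID for a dataset dimension of the given
 * length, creating a new one if the length has not been seen before.
 * Returns -1 if the table is full. */
int phony_dim_for(struct phony_dims *pd, uint64_t length)
{
    assert(pd != NULL);
    for (size_t i = 0; i < pd->count; i++)
        if (pd->len[i] == length)
            return (int)i;            /* previously encountered length */
    if (pd->count == MAX_PHONY)
        return -1;
    pd->len[pd->count] = length;      /* new length: new phony dimension */
    return (int)pd->count++;
}
```

Calling the function for every dimension of every dataset, in order, gives datasets of shape 10 × 20 and 20 × 30 three phony dimensions in total, with the length-20 dimension shared between them.
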
\subsection dim_spec3 Dimension and Coordinate Variable Ordering
579 In order to preserve creation order, the netCDF-4 library writes
580 variables in their creation order. Since some variables are also
581 dimension scales, their order reflects both the order of the
582 dimensions and the order of the coordinate variables.
584 However, these may be different. Consider the following code:
/* Create a test file. */
if (nc_create(FILE_NAME, NC_CLASSIC_MODEL|NC_NETCDF4, &ncid)) ERR;
/* Define dimensions in order. */
if (nc_def_dim(ncid, DIM0, NC_UNLIMITED, &dimids[0])) ERR;
if (nc_def_dim(ncid, DIM1, 4, &dimids[1])) ERR;
/* Define coordinate variables in a different order. */
if (nc_def_var(ncid, DIM1, NC_DOUBLE, 1, &dimids[1], &varid[1])) ERR;
if (nc_def_var(ncid, DIM0, NC_DOUBLE, 1, &dimids[0], &varid[0])) ERR;
599 In this case the order of the coordinate variables will be different
600 from the order of the dimensions.
In practice, this should make little difference in user code. For
code that does depend on the ordering of dimensions, the netCDF
library (as of version 4.1) detects this condition and adds the
attribute _Netcdf4Dimid to the dimension scales in the HDF5 file. This
attribute holds a scalar H5T_NATIVE_INT which is the (zero-based)
dimension ID for this dimension.
609 If this attribute is present on any dimension scale, it must be
610 present on all dimension scales in the file.
\subsection vars_spec Variables
614 Variables in netCDF-4/HDF5 files exactly correspond to HDF5
615 datasets. The data types match naturally between netCDF and HDF5.
617 In netCDF classic format, the problem of endianness is solved by
618 writing all data in big-endian order. The HDF5 library allows data to
619 be written as either big or little endian, and automatically reorders
620 the data when it is read, if necessary.
622 By default, netCDF uses the native types on the machine which writes
623 the data. Users may change the endianness of a variable (before any
624 data are written). In that case the specified endian type will be used
625 in HDF5 (for example, a H5T_STD_I16LE will be used for NC_SHORT, if
626 little-endian has been specified for that variable.)
627 - NC_BYTE = H5T_NATIVE_SCHAR
628 - NC_UBYTE = H5T_NATIVE_UCHAR
630 - NC_STRING = variable length array of H5T_C_S1
631 - NC_SHORT = H5T_NATIVE_SHORT
632 - NC_USHORT = H5T_NATIVE_USHORT
633 - NC_INT = H5T_NATIVE_INT
634 - NC_UINT = H5T_NATIVE_UINT
635 - NC_INT64 = H5T_NATIVE_LLONG
636 - NC_UINT64 = H5T_NATIVE_ULLONG
637 - NC_FLOAT = H5T_NATIVE_FLOAT
638 - NC_DOUBLE = H5T_NATIVE_DOUBLE
The NC_CHAR type represents a single character, and the NC_STRING an
array of characters. This can be confusing because a one-dimensional
array of NC_CHAR is used to represent a string (i.e. a scalar
NC_STRING).
An odd case may arise in which the user defines a variable with the
same name as a dimension, but which is not intended to be the
coordinate variable for that dimension. In this case the string
"_nc4_non_coord_" is prepended to the name of the HDF5 dataset, and
stripped from the name for the netCDF API.
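
The name mapping for this case amounts to a prefix check. In the sketch below, the macro and function names are hypothetical; only the prefix string itself comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NON_COORD_PREFIX "_nc4_non_coord_"

/* Map an HDF5 dataset name to the name shown through the netCDF API by
 * skipping the non-coordinate prefix, if present. The result points
 * into hdf5_name; no memory is allocated. */
const char *netcdf_name_of(const char *hdf5_name)
{
    assert(hdf5_name != NULL);
    size_t n = strlen(NON_COORD_PREFIX);
    if (strncmp(hdf5_name, NON_COORD_PREFIX, n) == 0)
        return hdf5_name + n;
    return hdf5_name;
}
```
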
\subsection atts_spec Attributes
653 Attributes in HDF5 and netCDF-4 correspond very closely. Each
654 attribute in an HDF5 file is represented as an attribute in the
655 netCDF-4 file, with the exception of the attributes below, which are
656 hidden by the netCDF-4 API.
657 - _Netcdf4Coordinates An integer array containing the dimension IDs of
658 a variable which is a multi-dimensional coordinate variable.
659 - _nc3_strict When this (scalar, H5T_NATIVE_INT) attribute exists in
660 the root group of the HDF5 file, the netCDF API will enforce the
661 netCDF classic model on the data file.
- REFERENCE_LIST This attribute is created and maintained by the HDF5
dimension scale API.
- CLASS This attribute is created and maintained by the HDF5 dimension
scale API.
- DIMENSION_LIST This attribute is created and maintained by the HDF5
dimension scale API.
- NAME This attribute is created and maintained by the HDF5 dimension
scale API.
670 - _Netcdf4Dimid Holds a scalar H5T_NATIVE_INT that is the (zero-based)
671 dimension ID for this dimension, needed when dimensions and
672 coordinate variables are defined in different orders.
673 - _NCProperties Holds provenance information about a file at the time
674 it was created. It specifies the versions of the netCDF and HDF5
675 libraries used to create the file.
\subsection user_defined_spec User-Defined Data Types
679 Each user-defined data type in an HDF5 file exactly corresponds to a
680 user-defined data type in the netCDF-4 file. Only base data types
681 which correspond to netCDF-4 data types may be used. (For example, no
682 HDF5 reference data types may be used.)
\subsection compression_spec Compression
686 The HDF5 library provides data compression using the zlib library and
687 the szlib library. NetCDF-4 only allows users to create data with the
688 zlib library (due to licensing restrictions on the szlib
689 library). Since HDF5 supports the transparent reading of the data with
690 either compression filter, the netCDF-4 library can read data
691 compressed with szlib (if the underlying HDF5 library is built to
692 support szlib), but has no way to write data with szlib compression.
694 With zlib compression (a.k.a. deflation) the user may set a deflation
695 factor from 0 to 9. In our measurements the zero deflation level does
696 not compress the data, but does incur the performance penalty of
697 compressing the data. The netCDF API does not allow the user to write
698 a variable with zlib deflation of 0 - when asked to do so, it turns
699 off deflation for the variable instead. NetCDF can read an HDF5 file
700 with deflation of zero, and correctly report that to the user.
\section netcdf_4_classic_spec The NetCDF-4 Classic Model Format

Every classic and 64-bit offset file can be represented as a netCDF-4
file, with no loss of information. There are some significant benefits
to using the simpler netCDF classic model with the netCDF-4 file
format. For example, software that writes or reads classic model data
can write or read netCDF-4 classic model format data by
recompiling/relinking to a netCDF-4 API library, with no or only
trivial changes needed to the program source code. The netCDF-4
classic model format supports this usage by enforcing rules on what
functions may be called to store data in the file, to make sure its
data can be read by older netCDF applications (when relinked to a
netCDF-4 library).
Writing data in this format prevents use of enhanced model features
such as groups, added primitive types not available in the classic
model, and user-defined types. However, performance features of the
netCDF-4 formats that do not require additional features of the
enhanced model, such as per-variable compression and chunking,
efficient dynamic schema changes, and larger variable size limits,
offer potentially significant performance improvements to readers of
data stored in this format, without requiring program changes.
When a file is created via the netCDF API with the NC_CLASSIC_MODEL
mode flag, the library creates an attribute (_nc3_strict) in the root
group. This attribute is hidden by the netCDF API, but is read when
the file is later opened, and used to ensure that no enhanced model
features are written to the file.
\section hdf4_sd_format HDF4 SD Format
733 Starting with version 4.1, the netCDF libraries can read HDF4 SD
734 (Scientific Dataset) files. Access is limited to those HDF4 files
735 created with the Scientific Dataset API. Access is read-only.
737 Dataset types are translated between HDF4 and netCDF in a
738 straightforward manner.
739 - DFNT_CHAR = NC_CHAR
740 - DFNT_UCHAR, DFNT_UINT8 = NC_UBYTE
741 - DFNT_INT8 = NC_BYTE
742 - DFNT_INT16 = NC_SHORT
743 - DFNT_UINT16 = NC_USHORT
744 - DFNT_INT32 = NC_INT
745 - DFNT_UINT32 = NC_UINT
746 - DFNT_FLOAT32 = NC_FLOAT
747 - DFNT_FLOAT64 = NC_DOUBLE