NetCDF Users Guide
******************

This guide describes the netCDF object model. This document applies to
netCDF version 3.6.3, and was last updated on 7 June 2008.

   Interface guides are available for C (*note The NetCDF C Interface
Guide: (netcdf-c)Top.), C++ (*note The NetCDF C++ Interface Guide:
(netcdf-cxx)Top.), Fortran 77 (*note The NetCDF Fortran 77 Interface
Guide: (netcdf-f77)Top.), and Fortran 90 (*note The NetCDF Fortran 90
Interface Guide: (netcdf-f90)Top.).

   Separate documentation for the netCDF Java library can be found at
the netCDF-Java website,
`http://www.unidata.ucar.edu/software/netcdf-java'.

   For installation and porting information, see *Note The NetCDF
Installation and Porting Guide: (netcdf-install)Top.

Foreword
********

Unidata (`http://www.unidata.ucar.edu') is a National Science
Foundation-sponsored program empowering U.S. universities, through
innovative applications of computers and networks, to make the best use
of atmospheric and related data for enhancing education and research.
For analyzing and displaying such data, the Unidata Program Center
offers universities several supported software packages developed by
other organizations. Underlying these is a Unidata-developed system for
acquiring and managing data in real time, making practical the Unidata
principle that each university should acquire and manage its own data
holdings as local requirements dictate. It is significant that the
Unidata program has no data center--the management of data is a
"distributed" function.

   The Network Common Data Form (netCDF) software described in this
guide was originally intended to provide a common data access method
for the various Unidata applications. These deal with a variety of data
types that encompass single-point observations, time series,
regularly-spaced grids, and satellite or radar images.

   The netCDF software functions as an I/O library, callable from C,
FORTRAN, C++, Perl, or other language for which a netCDF library is
available. The library stores and retrieves data in self-describing,
machine-independent datasets. Each netCDF dataset can contain
multidimensional, named variables (with differing types that include
integers, reals, characters, bytes, etc.), and each variable may be
accompanied by ancillary data, such as units of measure or descriptive
text. The interface includes a method for appending data to existing
netCDF datasets in prescribed ways, functionality that is not unlike a
(fixed length) record structure. However, the netCDF library also
allows direct-access storage and retrieval of data by variable name and
index and therefore is useful only for disk-resident (or
memory-resident) datasets.

   NetCDF access has been implemented in about half of Unidata's
software, so far, and it is planned that such commonality will extend
across all Unidata applications in order to:

   * Facilitate the use of common datasets by distinct applications.

   * Permit datasets to be transported between or shared by dissimilar
     computers transparently, i.e., without translation.

   * Reduce the programming effort usually spent interpreting formats.

   * Reduce errors arising from misinterpreting data and ancillary data.

   * Facilitate using output from one application as input to another.

   * Establish an interface standard which simplifies the inclusion of
     new software into the Unidata system.


   A measure of success has been achieved. NetCDF is now in use on
computing platforms that range from personal computers to
supercomputers and include most UNIX-based workstations. It can be used
to create a complex dataset on one computer (say in FORTRAN) and
retrieve that same self-describing dataset on another computer (say in
C) without intermediate translations--netCDF datasets can be
transferred across a network, or they can be accessed remotely using a
suitable network file system or remote access protocols.

   Because we believe that the use of netCDF access in non-Unidata
software will benefit Unidata's primary constituency--such use may
result in more options for analyzing and displaying Unidata
information--the netCDF library is distributed without licensing or
other significant restrictions, and current versions can be obtained
via anonymous FTP. Apparently the software has been well received by a
wide range of institutions beyond the atmospheric science community,
and a substantial number of public domain and commercial data analysis
systems can now accept netCDF datasets as input.

   Several organizations have adopted netCDF as a data access standard,
and there is an effort underway at the National Center for
Supercomputer Applications (NCSA, which is associated with the
University of Illinois at Urbana-Champaign) to support the netCDF
programming interfaces as a means to store and retrieve data in "HDF
files," i.e., in the format used by the popular NCSA tools. We have
encouraged and cooperated with these efforts.

   Questions occasionally arise about the level of support provided for
the netCDF software. Unidata's formal position, stated in the copyright
notice which accompanies the netCDF library, is that the software is
provided "as is". In practice, the software is updated from time to
time, and Unidata intends to continue making improvements for the
foreseeable future. Because Unidata's mission is to serve geoscientists
at U.S. universities, problems reported by that community necessarily
receive the greatest attention.

   We hope the reader will find the software useful and will give us
feedback on its application as well as suggestions for its improvement.

   David Fulker, 1996

   Unidata Program Center Director, University Corporation for
Atmospheric Research

Summary
*******

The purpose of the Network Common Data Form (netCDF) interface is to
allow you to create, access, and share array-oriented data in a form
that is self-describing and portable. "Self-describing" means that a
dataset includes information defining the data it contains. "Portable"
means that the data in a dataset is represented in a form that can be
accessed by computers with different ways of storing integers,
characters, and floating-point numbers. Using the netCDF interface for
creating new datasets makes the data portable. Using the netCDF
interface in software for data access, management, analysis, and
display can make the software more generally useful.

   The netCDF software includes C, Fortran 77, Fortran 90, and C++
interfaces for accessing netCDF data. These libraries are available for
many common computing platforms.

   The community of netCDF users has contributed ports of the software
to additional platforms and interfaces for other programming languages
as well. Source code for netCDF software libraries is freely available
to encourage the sharing of both array-oriented data and the software
that makes the data useful.

   This User's Guide presents the netCDF data model. It explains how the
netCDF data model uses dimensions, variables, and attributes to store
data. Language specific programming guides are available for C (*note
The NetCDF C Interface Guide: (netcdf-c)Top.), C++ (*note The NetCDF
C++ Interface Guide: (netcdf-cxx)Top.), Fortran 77 (*note The NetCDF
Fortran 77 Interface Guide: (netcdf-f77)Top.), and Fortran 90 (*note
The NetCDF Fortran 90 Interface Guide: (netcdf-f90)Top.).

   Reference documentation for UNIX systems, in the form of UNIX 'man'
pages for the C and FORTRAN interfaces is also available at the netCDF
web site (`http://www.unidata.ucar.edu/software/netcdf'), and with the
netCDF distribution.

   The latest version of this document, and the language specific
guides, can be found at the netCDF web site,
`http://www.unidata.ucar.edu/software/netcdf', along with extensive
additional information about netCDF, including pointers to other
software that works with netCDF data.

   Separate documentation of the Java netCDF library can be found at
`http://www.unidata.ucar.edu/software/netcdf-java'.

   For installation and porting information *Note The NetCDF
Installation and Porting Guide: (netcdf-install)Top.

1 Introduction
**************

1.1 The NetCDF Interface
========================

The Network Common Data Form, or netCDF, is an interface to a library
of data access functions for storing and retrieving data in the form of
arrays. An array is an n-dimensional (where n is 0, 1, 2, ...)
rectangular structure containing items which all have the same data
type (e.g., 8-bit character, 32-bit integer). A "scalar" (simple single
value) is a 0-dimensional array.

   NetCDF is an abstraction that supports a view of data as a collection
of self-describing, portable objects that can be accessed through a
simple interface. Array values may be accessed directly, without
knowing details of how the data are stored. Auxiliary information about
the data, such as what units are used, may be stored with the data.
Generic utilities and application programs can access netCDF datasets
and transform, combine, analyze, or display specified fields of the
data. The development of such applications has led to improved
accessibility of data and improved re-usability of software for
array-oriented data management, analysis, and display.

   The netCDF software implements an abstract data type, which means
that all operations to access and manipulate data in a netCDF dataset
must use only the set of functions provided by the interface. The
representation of the data is hidden from applications that use the
interface, so that how the data are stored could be changed without
affecting existing programs. The physical representation of netCDF data
is designed to be independent of the computer on which the data were
written.

   Unidata supports the netCDF interfaces for C, (*note The NetCDF C
Interface Guide: (netcdf-c)Top.), FORTRAN 77 (*note The NetCDF Fortran
77 Interface Guide: (netcdf-f77)Top.), FORTRAN 90 (*note The NetCDF
Fortran 90 Interface Guide: (netcdf-f90)Top.), and C++ (*note The
NetCDF C++ Interface Guide: (netcdf-cxx)Top.).

   The netCDF library is supported for various UNIX operating systems. A
MS Windows port is also available. The software is also ported and
tested on a few other operating systems, with assistance from users
with access to these systems, before each major release. Unidata's
netCDF software is freely available via FTP to encourage its widespread
use. (`ftp://ftp.unidata.ucar.edu/pub/netcdf').

   For detailed installation instructions, see the Porting and
Installation Guide. *Note The NetCDF Installation and Porting Guide:
(netcdf-install)Top.

1.2 NetCDF Is Not a Database Management System
==============================================

Why not use an existing database management system for storing
array-oriented data? Relational database software is not suitable for
the kinds of data access supported by the netCDF interface.

   First, existing database systems that support the relational model do
not support multidimensional objects (arrays) as a basic unit of data
access. Representing arrays as relations makes some useful kinds of
data access awkward and provides little support for the abstractions of
multidimensional data and coordinate systems. A quite different data
model is needed for array-oriented data to facilitate its retrieval,
modification, mathematical manipulation and visualization.

   Related to this is a second problem with general-purpose database
systems: their poor performance on large arrays. Collections of
satellite images, scientific model outputs and long-term global weather
observations are beyond the capabilities of most database systems to
organize and index for efficient retrieval.

   Finally, general-purpose database systems provide, at significant
cost in terms of both resources and access performance, many facilities
that are not needed in the analysis, management, and display of
array-oriented data. For example, elaborate update facilities, audit
trails, report formatting, and mechanisms designed for
transaction-processing are unnecessary for most scientific applications.

1.3 The netCDF File Format
==========================

Until version 3.6.0, all versions of netCDF employed only one binary
data format, now referred to as netCDF classic format. NetCDF classic
is the default format for all versions of netCDF.

   In version 3.6.0 a new binary format was introduced, 64-bit offset
format. Nearly identical to netCDF classic format, it uses 64-bit
offsets (hence the name), and allows users to create far larger
datasets.

   By default, netCDF uses the classic format. To use the 64-bit offset
set the appropriate mode flag when creating the file.

   To achieve network-transparency (machine-independence), netCDF
classic and 64-bit offset formats are implemented in terms of an
external representation much like XDR (eXternal Data Representation, see
`http://www.ietf.org/rfc/rfc1832.txt'), a standard for describing and
encoding data. This representation provides encoding of data into
machine-independent sequences of bits. It has been implemented on a
wide variety of computers, by assuming only that eight-bit bytes can be
encoded and decoded in a consistent way. The IEEE 754 floating-point
standard is used for floating-point data representation.

   Descriptions of the overall structure of netCDF classic and 64-bit
offset files are provided later in this manual. *Note File Structure
and Performance::.

   The details of the classic and 64-bit offset formats are described in
an appendix.  *Note File Format::. However, users are discouraged from
using the format specification to develop independent low-level
software for reading and writing netCDF files, because this could lead
to compatibility problems if the format is ever modified.

1.4 How to Select the Format
============================

The format of a netCDF file is determined at create time. The default
is classic format. The 64-bit offset format will only be used if the
mode parameter to the create function includes the flag for 64-bit
offset format.

   When opening an existing netCDF file the netCDF library will
transparently detect its format and adjust accordingly. However, netCDF
library versions earlier than 3.6.0 cannot read 64-bit offset format
files. NetCDF classic format files (even if created by version 3.6.0 or
later) remain compatible with older versions of the netCDF library.

   Users are encouraged to use netCDF classic format to distribute data,
for maximum portability.

   To select 64-bit offset format files, C programmers should use flag
NC_64BIT_OFFSET in function nc_create. *Note nc_create:
(netcdf-c)nc_create.

   In Fortran 77, use flag nf_64bit_offset in function NF_CREATE. *Note
NF_CREATE: (netcdf-f77)NF_CREATE.

   In Fortran 90, use flag NF90_64BIT_OFFSET in function NF90_CREATE.
*Note NF90_CREATE: (netcdf-f90)NF90_CREATE.

   It is also possible to change the default creation format, to convert
a large body of code without changing every create call. C programmers
see *Note nc_set_default_format: (netcdf-c)nc_set_default_format.
Fortran programs see *Note NF_SET_DEFAULT_FORMAT:
(netcdf-f77)NF_SET_DEFAULT_FORMAT.

1.4.1 NetCDF Classic Format
---------------------------

The original netCDF format is identified using four bytes in the file
header.  All files in this format have "CDF\x01" at the beginning of the
file.  In this documentation this format is referred to as "netCDF
classic format."

   NetCDF classic format is identical to the format used by every
previous version of netCDF.  It has maximum portability, and is still
the default netCDF format.

1.4.2 NetCDF 64-bit Offset Format
---------------------------------

Files with the 64-bit offsets are identified with a "CDF\x02" at the
beginning of the file. In this documentation this format is called
"64-bit offset format."

   For some users, the various 2 and 4 GiB format limitations of the
classic format become a problem. (*note NetCDF Classic Format
Limitations::). For these users, 64-bit offset format is a natural
choice. It greatly eases the size restrictions of netCDF classic files.

   Since 64-bit offset format was introduced in version 3.6.0, earlier
versions of the netCDF library can't read 64-bit offset files.

   Create files in 64-bit offset-4 format by specifying the 64 bit
offset flag when creating a file.

1.5 What about Performance?
===========================

One of the goals of netCDF is to support efficient access to small
subsets of large datasets. To support this goal, netCDF uses direct
access rather than sequential access. This can be much more efficient
when the order in which data is read is different from the order in
which it was written, or when it must be read in different orders for
different applications.

   The amount of overhead for a portable external representation depends
on many factors, including the data type, the type of computer, the
granularity of data access, and how well the implementation has been
tuned to the computer on which it is run. This overhead is typically
small in comparison to the overall resources used by an application. In
any case, the overhead of the external representation layer is usually
a reasonable price to pay for portable data access.

   Although efficiency of data access has been an important concern in
designing and implementing netCDF, it is still possible to use the
netCDF interface to access data in inefficient ways: for example, by
requesting a slice of data that requires a single value from each
record. Advice on how to use the interface efficiently is provided in
*Note File Structure and Performance::.

1.6 Is NetCDF a Good Archive Format?
====================================

NetCDF can be used as a general-purpose archive format for storing
arrays. Compression of data is possible with netCDF (e.g., using arrays
of eight-bit or 16-bit integers to encode low-resolution floating-point
numbers instead of arrays of 32-bit numbers), but the current version
of netCDF was not designed to achieve optimal compression of data.
Hence, using netCDF may require more space than special-purpose archive
formats that exploit knowledge of particular characteristics of
specific datasets.

1.7 Creating Self-Describing Data conforming to Conventions
===========================================================

The mere use of netCDF is not sufficient to make data "self-describing"
and meaningful to both humans and machines. The names of variables and
dimensions should be meaningful and conform to any relevant
conventions. Dimensions should have corresponding coordinate variables
where sensible.

   Attributes play a vital role in providing ancillary information. It
is important to use all the relevant standard attributes using the
relevant conventions. For a description of reserved attributes (used by
the netCDF library) and attribute conventions for generic application
software, see *Note Attribute Conventions::.

   A number of groups have defined their own additional conventions and
styles for netCDF data. Descriptions of these conventions, as well as
examples incorporating them can be accessed from the netCDF Conventions
site, `http://www.unidata.ucar.edu/software/netcdfconventions.html'.

   These conventions should be used where suitable. Additional
conventions are often needed for local use. These should be contributed
to the above netCDF conventions site if likely to interest other users
in similar areas.

1.8 Background and Evolution of the NetCDF Interface
====================================================

The development of the netCDF interface began with a modest goal
related to Unidata's needs: to provide a common interface between
Unidata applications and real-time meteorological data. Since Unidata
software was intended to run on multiple hardware platforms with access
from both C and FORTRAN, achieving Unidata's goals had the potential
for providing a package that was useful in a broader context. By making
the package widely available and collaborating with other organizations
with similar needs, we hoped to improve the then current situation in
which software for scientific data access was only rarely reused by
others in the same discipline and almost never reused between
disciplines (Fulker, 1988).

   Important concepts employed in the netCDF software originated in a
paper (Treinish and Gough, 1987) that described data-access software
developed at the NASA Goddard National Space Science Data Center
(NSSDC). The interface provided by this software was called the Common
Data Format (CDF). The NASA CDF was originally developed as a
platform-specific FORTRAN library to support an abstraction for storing
arrays.

   The NASA CDF package had been used for many different kinds of data
in an extensive collection of applications. It had the virtues of
simplicity (only 13 subroutines), independence from storage format,
generality, ability to support logical user views of data, and support
for generic applications.

   Unidata held a workshop on CDF in Boulder in August 1987. We proposed
exploring the possibility of collaborating with NASA to extend the CDF
FORTRAN interface, to define a C interface, and to permit the access of
data aggregates with a single call, while maintaining compatibility
with the existing NASA interface.

   Independently, Dave Raymond at the New Mexico Institute of Mining and
Technology had developed a package of C software for UNIX that
supported sequential access to self-describing array-oriented data and
a "pipes and filters" (or "data flow") approach to processing,
analyzing, and displaying the data. This package also used the "Common
Data Format" name, later changed to C-Based Analysis and Display System
(CANDIS). Unidata learned of Raymond's work (Raymond, 1988), and
incorporated some of his ideas, such as the use of named dimensions and
variables with differing shapes in a single data object, into the
Unidata netCDF interface.

   In early 1988, Glenn Davis of Unidata developed a prototype netCDF
package in C that was layered on XDR. This prototype proved that a
single-file, XDR-based implementation of the CDF interface could be
achieved at acceptable cost and that the resulting programs could be
implemented on both UNIX and VMS systems. However, it also demonstrated
that providing a small, portable, and NASA CDF-compatible FORTRAN
interface with the desired generality was not practical. NASA's CDF and
Unidata's netCDF have since evolved separately, but recent CDF versions
share many characteristics with netCDF.

   In early 1988, Joe Fahle of SeaSpace, Inc. (a commercial software
development firm in San Diego, California), a participant in the 1987
Unidata CDF workshop, independently developed a CDF package in C that
extended the NASA CDF interface in several important ways (Fahle,
1989). Like Raymond's package, the SeaSpace CDF software permitted
variables with unrelated shapes to be included in the same data object
and permitted a general form of access to multidimensional arrays.
Fahle's implementation was used at SeaSpace as the intermediate form of
storage for a variety of steps in their image-processing system. This
interface and format have subsequently evolved into the Terascan data
format.

   After studying Fahle's interface, we concluded that it solved many of
the problems we had identified in trying to stretch the NASA interface
to our purposes. In August 1988, we convened a small workshop to agree
on a Unidata netCDF interface, and to resolve remaining open issues.
Attending were Joe Fahle of SeaSpace, Michael Gough of Apple (an author
of the NASA CDF software), Angel Li of the University of Miami (who had
implemented our prototype netCDF software on VMS and was a potential
user), and Unidata systems development staff. Consensus was reached at
the workshop after some further simplifications were discovered. A
document incorporating the results of the workshop into a proposed
Unidata netCDF interface specification was distributed widely for
comments before Glenn Davis and Russ Rew implemented the first version
of the software. Comparison with other data-access interfaces and
experience using netCDF are discussed in Rew and Davis (1990a), Rew and
Davis (1990b), Jenter and Signell (1992), and Brown, Folk, Goucher, and
Rew (1993).

   In October 1991, we announced version 2.0 of the netCDF software
distribution. Slight modifications to the C interface (declaring
dimension lengths to be long rather than int) improved the usability of
netCDF on inexpensive platforms such as MS-DOS computers, without
requiring recompilation on other platforms. This change to the
interface required no changes to the associated file format.

   Release of netCDF version 2.3 in June 1993 preserved the same file
format but added single call access to records, optimizations for
accessing cross-sections involving non-contiguous data, subsampling
along specified dimensions (using 'strides'), accessing non-contiguous
data (using 'mapped array sections'), improvements to the ncdump and
ncgen utilities, and an experimental C++ interface.

   In version 2.4, released in February 1996, support was added for new
platforms and for the C++ interface, significant optimizations were
implemented for supercomputer architectures, and the file format was
formally specified in an appendix to the User's Guide.

   FAN (File Array Notation), software providing a high-level interface
to netCDF data, was made available in May 1996. The capabilities of the
FAN utilities include extracting and manipulating array data from
netCDF datasets, printing selected data from netCDF arrays, copying
ASCII data into netCDF arrays, and performing various operations (sum,
mean, max, min, product, and others) on netCDF arrays.

   In 1996 and 1997, Joe Sirott implemented and made available the first
implementation of a read-only netCDF interface for Java, Bill Noon made
a Python module available for netCDF, and Konrad Hinsen contributed
another netCDF interface for Python.

   In May 1997, Version 3.3 of netCDF was released. This included a new
type-safe interface for C and Fortran, as well as many other
improvements.  A month later, Charlie Zender released version 1.0 of
the NCO (netCDF Operators) package, providing command-line utilities
for general purpose operations on netCDF data.

   Version 3.4 of Unidata's netCDF software, released in March 1998,
included initial large file support, performance enhancements, and
improved Cray platform support.  Later in 1998, Dan Schmitt provided a
Tcl/Tk interface, and Glenn Davis provided version 1.0 of netCDF for
Java.

   In May 1999, Glenn Davis, who was instrumental in creating and
developing netCDF, died in a small plane crash during a thunderstorm.
The memory of Glenn's passions and intellect continue to inspire those
of us who worked with him.

   In February 2000, an experimental Fortran 90 interface developed by
Robert Pincus was released.

   John Caron released netCDF for Java, version 2.0 in February 2001.
This version incorporated a new high-performance package for
multidimensional arrays, simplified the interface, and included OPeNDAP
(known previously as DODS) remote access, as well as remote netCDF
access via HTTP contributed by Don Denbo.

   In March 2001, NetCDF 3.5.0 was released. This release fully
integrated the new Fortran 90 interface, enhanced portability, improved
the C++ interface, and added a few new tuning functions.

   Also in 2001, Takeshi Horinouchi and colleagues made a netCDF
interface for Ruby available, as did David Pierce for the R language
for statistical computing and graphics.  Charles Denham released
WetCDF, an independent implementation of the netCDF interface for
MATLAB, as well as updates to the popular netCDF Toolbox for MATLAB.

   In 2002, Unidata and collaborators developed NcML, an XML
representation for netCDF data useful for cataloging data holdings,
aggregation of data from multiple datasets, augmenting metadata in
existing datasets, and support for alternative views of data.  The Java
interface currently provides access to netCDF data through NcML.

   Additional developments in 2002 included translation of C and Fortran
User Guides into Japanese by Masato Shiotani and colleagues, creation
of a "Best Practices" guide for writing netCDF files, and provision of
an Ada-95 interface by Alexandru Corlan.

   In July 2003 a group of researchers at Northwestern University and
Argonne National Laboratory (Jianwei Li, Wei-keng Liao, Alok Choudhary,
Robert Ross, Rajeev Thakur, William Gropp, and Rob Latham) contributed
a new parallel interface for writing and reading netCDF data, tailored
for use on high performance platforms with parallel I/O. The
implementation built on the MPI-IO interface, providing portability to
many platforms.

   In October 2003, Greg Sjaardema contributed support for an
alternative format with 64-bit offsets, to provide more complete
support for very large files. These changes, with slight modifications
at Unidata, were incorporated into version 3.6.0, released in December,
2004.

   In 2004, thanks to a NASA grant, Unidata and NCSA began a
collaboration to increase the interoperability of netCDF and HDF5, and
bring some advanced HDF5 features to netCDF users. This resulted in
netCDF-4.0, currently (as of March, 2006) in alpha release.

1.9 What's New Since the Previous Release?
==========================================

This Guide documents the 3.6.3 release of netCDF, which fixes some bugs
and improves documentation and portability.

1.10 Limitations of NetCDF
==========================

The netCDF data model is widely applicable to data that can be
organized into a collection of named array variables with named
attributes, but there are some important limitations to the model and
its implementation in software.

   Currently, netCDF offers a limited number of external numeric data
types: 8-, 16-, 32-bit integers, or 32- or 64-bit floating-point
numbers. This limited set of sizes may use file space inefficiently
compared to packing data in bit fields. For example, arrays of 9-bit
values must be stored in 16-bit short integers. Storing arrays of 1- or
2-bit values in 8-bit values is even less optimal.

   With the classic netCDF file format, there are constraints that limit
how a dataset is structured to store more than 2 "GiBytes" (2^30 or
1,073,741,824 bytes, as compared to a "Gbyte", which is 1,000,000,000
bytes) of data in a single netCDF dataset. (*note NetCDF Classic Format
Limitations::).  This limitation is a result of 32-bit offsets used for
storing relative offsets within a classic netCDF format file. Since one
of the goals of netCDF is portable data and some computing platforms
still can't deal with files larger than 2 GiB, it is best to keep files
that must be portable below this limit. Nevertheless, it is possible to
create and access netCDF files larger than 2 GiB on platforms that
provide support for such files (*note Large File Support::).

   The new 64-bit offset format (introduced version 3.6.0) allows large
files, and makes it easy to create to create fixed variables of about 4
GiB, and record variables of about 4 GiB per record. (*note NetCDF 64
bit Offset Format Limitations::). However, old netCDF applications will
not be able to read the 64-bit offset files until they are upgraded to
at least version 3.6.0 of netCDF.

   Another limitation of the classic (and 64-bit offset) model is that
only one unlimited (changeable) dimension is permitted for each netCDF
data set. Multiple variables can share an unlimited dimension, but then
they must all grow together. Hence the netCDF model does not permit
variables with several unlimited dimensions or the use of multiple
unlimited dimensions in different variables within the same dataset.
Variables that have non-rectangular shapes (for example, ragged arrays)
cannot be represented conveniently.

   The extent to which data can be completely self-describing is
limited: there is always some assumed context without which sharing and
archiving data would be impractical. NetCDF permits storing meaningful
names for variables, dimensions, and attributes; units of measure in a
form that can be used in computations; text strings for attribute
values that apply to an entire data set; and simple kinds of coordinate
system information. But for more complex kinds of metadata (for
example, the information necessary to provide accurate georeferencing
of data on unusual grids or from satellite images), it is often
necessary to develop conventions.

   Specific additions to the netCDF data model might make some of these
conventions unnecessary or allow some forms of metadata to be
represented in a uniform and compact way. For example, adding explicit
georeferencing to the netCDF data model would simplify elaborate
georeferencing conventions at the cost of complicating the model. The
problem is finding an appropriate trade-off between the richness of the
model and its generality (i.e., its ability to encompass many kinds of
data). A data model tailored to capture the shared context among
researchers within one discipline may not be appropriate for sharing or
combining data from multiple disciplines.

   The classic netCDF data model does not support nested data structures
such as trees, nested arrays, or other recursive structures. (This
limitation also applies to 64-bit offset files.) Through use of
indirection and conventions it is possible to represent some kinds of
nested structures, but the result may fall short of the netCDF goal of
self-describing data.

   Finally, concurrent access to a netCDF dataset is limited. One writer
and multiple readers may access data in a single dataset
simultaneously, but there is no support for multiple concurrent writers.

1.11 Plans for NetCDF
=====================

NetCDF-4.0 is currently available in alpha release. It allows the use
of HDF5 files from the netCDF API. New features available with
netCDF-4/HDF5 files include:

   * The user of groups to organize datasets.

   * New unsigned integer data types, 64-bit integer types, and a string
     type.

   * A user defined compound type, which can be constructed by users to
     match a C struct or other arbitrary organization of types.

   * A variable length array type.

   * Support for parallel I/O.


   NetCDF-4 also contains a complete upgrade of the installation and
build system, including the use of the libtool package to build shared
libraries on systems which support them.

   More information about netCDF-4 can be found at the netCDF-4 web
page `http://www.unidata.ucar.edu/software/netcdf/netcdf-4'.

   For more information about HDF5, see the HDF5 web site:
`http://hdf.ncsa.uiuc.edu/HDF5'.

1.12 References
===============

  1. Brown, S. A, M. Folk, G. Goucher, and R. Rew, "Software for
     Portable Scientific Data Management," Computers in Physics,
     American Institute of Physics, Vol. 7, No. 3, May/June 1993.

  2. Davies, H. L., "FAN - An array-oriented query language," Second
     Workshop on Database Issues for Data Visualization (Visualization
     1995), Atlanta, Georgia, IEEE, October 1995.

  3. Fahle, J., TeraScan Applications Programming Interface, SeaSpace,
     San Diego, California, 1989.

  4. Fulker, D. W., "The netCDF: Self-Describing, Portable Files---a
     Basis for 'Plug-Compatible' Software Modules Connectable by
     Networks," ICSU Workshop on Geophysical Informatics, Moscow, USSR,
     August 1988.

  5. Fulker, D. W., "Unidata Strawman for Storing Earth-Referencing
     Data," Seventh International Conference on Interactive Information
     and Processing Systems for Meteorology, Oceanography, and
     Hydrology, New Orleans, La., American Meteorology Society, January
     1991.

  6. Gough, M. L., NSSDC CDF Implementor's Guide (DEC VAX/VMS) Version
     1.1, National Space Science Data Center, 88-17, NASA/Goddard Space
     Flight Center, 1988.

  7. Jenter, H. L. and R. P. Signell, "NetCDF: A Freely-Available
     Software-Solution to Data-Access Problems for Numerical Modelers,"
     Proceedings of the American Society of Civil Engineers Conference
     on Estuarine and Coastal Modeling, Tampa, Florida, 1992.

  8. Raymond, D. J., "A C Language-Based Modular System for Analyzing
     and Displaying Gridded Numerical Data," Journal of Atmospheric and
     Oceanic Technology, 5, 501-511, 1988.

  9. Rew, R. K. and G. P. Davis, "The Unidata netCDF: Software for
     Scientific Data Access," Sixth International Conference on
     Interactive Information and Processing Systems for Meteorology,
     Oceanography, and Hydrology, Anaheim, California, American
     Meteorology Society, February 1990.

 10. Rew, R. K. and G. P. Davis, "NetCDF: An Interface for Scientific
     Data Access," Computer Graphics and Applications, IEEE, pp. 76-82,
     July 1990.

 11. Rew, R. K. and G. P. Davis, "Unidata's netCDF Interface for Data
     Access: Status and Plans," Thirteenth International Conference on
     Interactive Information and Processing Systems for Meteorology,
     Oceanography, and Hydrology, Anaheim, California, American
     Meteorology Society, February 1997.

 12. Treinish, L. A. and M. L. Gough, "A Software Package for the Data
     Independent Management of Multi-Dimensional Data," EOS
     Transactions, American Geophysical Union, 68, 633-635, 1987.

2 Components of a NetCDF Dataset
********************************

2.1 The NetCDF Data Model
=========================

A netCDF dataset contains dimensions, variables, and attributes, which
all have both a name and an ID number by which they are identified.
These components can be used together to capture the meaning of data
and relations among data fields in an array-oriented dataset. The
netCDF library allows simultaneous access to multiple netCDF datasets
which are identified by dataset ID numbers, in addition to ordinary
file names.

2.1.1 Naming Conventions
------------------------

The names of dimensions, variables and attributes must conform to the
definition of a legal identifier.  A legal identifier is any sequence
of one or more alphabetic characters, digits, and the following special
characters: '_', '.', '-', '@', and '+'. The identifier must, however,
start with an alphabetic character or underscore. Case is significant
and names commencing with underscore are reserved for system use.

   Note specifically that the extended ISO8859 character set
(characters in the range 0x80-0xFF) are specifically excluded because
they conflict with the UTF8 format (if that is enabled).

   If UTF8 support is enabled, then CDL names may contain any legal
UTF8 multibyte character.

   The set of legal identifier characters (after the first character)
has been unofficially extended to include the following characters:
':', '(', and ')'.  The use of these characters is deprecated and
conforming files should not use them.  They may declared illegal in the
future and are currently not supported in ncgen.

2.1.2 Network Common Data Form Language (CDL)
---------------------------------------------

We will use a small netCDF example to illustrate the concepts of the
netCDF data model. This includes dimensions, variables, and attributes.
The notation used to describe this simple netCDF object is called CDL
(network Common Data form Language), which provides a convenient way of
describing netCDF datasets. The netCDF system includes utilities for
producing human-oriented CDL text files from binary netCDF datasets and
vice versa.

     netcdf example_1 {  // example of CDL notation for a netCDF dataset

     dimensions:         // dimension names and lengths are declared first
             lat = 5, lon = 10, level = 4, time = unlimited;

     variables:          // variable types, names, shapes, attributes
             float   temp(time,level,lat,lon);
                         temp:long_name     = "temperature";
                         temp:units         = "celsius";
             float   rh(time,lat,lon);
                         rh:long_name = "relative humidity";
                         rh:valid_range = 0.0, 1.0;      // min and max
             int     lat(lat), lon(lon), level(level);
                         lat:units       = "degrees_north";
                         lon:units       = "degrees_east";
                         level:units     = "millibars";
             short   time(time);
                         time:units      = "hours since 1996-1-1";
             // global attributes
                         :source = "Fictional Model Output";

     data:                // optional data assignments
             level   = 1000, 850, 700, 500;
             lat     = 20, 30, 40, 50, 60;
             lon     = -160,-140,-118,-96,-84,-52,-45,-35,-25,-15;
             time    = 12;
             rh      =.5,.2,.4,.2,.3,.2,.4,.5,.6,.7,
                      .1,.3,.1,.1,.1,.1,.5,.7,.8,.8,
                      .1,.2,.2,.2,.2,.5,.7,.8,.9,.9,
                      .1,.2,.3,.3,.3,.3,.7,.8,.9,.9,
                       0,.1,.2,.4,.4,.4,.4,.7,.9,.9;
     }

   The CDL notation for a netCDF dataset can be generated automatically
by using ncdump, a utility program described later (*note ncdump::).
Another netCDF utility, ncgen, generates a netCDF dataset (or
optionally C or FORTRAN source code containing calls needed to produce
a netCDF dataset) from CDL input (*note ncgen::).

   The CDL notation is simple and largely self-explanatory. It will be
explained more fully as we describe the components of a netCDF dataset.
For now, note that CDL statements are terminated by a semicolon.
Spaces, tabs, and newlines can be used freely for readability. Comments
in CDL follow the characters '//' on any line. A CDL description of a
netCDF dataset takes the form

       netCDF name {
         dimensions: ...
         variables: ...
         data: ...
       }

   where the name is used only as a default in constructing file names
by the ncgen utility. The CDL description consists of three optional
parts, introduced by the keywords dimensions, variables, and data.
NetCDF dimension declarations appear after the dimensions keyword,
netCDF variables and attributes are defined after the variables
keyword, and variable data assignments appear after the data keyword.

   The ncgen utility provides a command line option which indicates the
desired output format. Limitations are enforced for the selected format
- that is, some cdl files may be expressible only in 64-bit offset
format.

   For example, trying to create a file with very large variables in
classic format may result in an error because size limits are violated.

2.2 Dimensions
==============

A dimension may be used to represent a real physical dimension, for
example, time, latitude, longitude, or height. A dimension might also
be used to index other quantities, for example station or
model-run-number.

   A netCDF dimension has both a name and a length.

   A dimension length is an arbitrary positive integer, except that one
dimension can have the length UNLIMITED.

   Such a dimension is called the unlimited dimension or the record
dimension. A variable with an unlimited dimension can grow to any
length along that dimension. The unlimited dimension index is like a
record number in conventional record-oriented files.

   A netCDF classic or 64-bit offset dataset can have at most one
unlimited dimension, but need not have any. If a variable has an
unlimited dimension, that dimension must be the most significant
(slowest changing) one. Thus any unlimited dimension must be the first
dimension in a CDL shape and the first dimension in corresponding C
array declarations.

   To grow variables along an unlimited dimension, write the data using
any of the netCDF data writing functions, and specify the index of the
unlimited dimension to the desired record number. The netCDF library
will write however many records are needed (using the fill value,
unless turned off) to fill in any intervening records.

   CDL dimension declarations may appear on one or more lines following
the CDL keyword dimensions. Multiple dimension declarations on the same
line may be separated by commas. Each declaration is of the form name =
length.

   There are four dimensions in the above example: lat, lon, level, and
time (*note The NetCDF Data Model::). The first three are assigned
fixed lengths; time is assigned the length UNLIMITED, which means it is
the unlimited dimension.

   The basic unit of named data in a netCDF dataset is a variable. When
a variable is defined, its shape is specified as a list of dimensions.
These dimensions must already exist. The number of dimensions is called
the rank (a.k.a. dimensionality). A scalar variable has rank 0, a
vector has rank 1 and a matrix has rank 2.

   It is possible (since version 3.1 of netCDF) to use the same
dimension more than once in specifying a variable shape. For example,
correlation(instrument, instrument) could be a matrix giving
correlations between measurements using different instruments. But data
whose dimensions correspond to those of physical space/time should have
a shape comprising different dimensions, even if some of these have the
same length.

2.3 Variables
=============

Variables are used to store the bulk of the data in a netCDF dataset. A
variable represents an array of values of the same type. A scalar value
is treated as a 0-dimensional array. A variable has a name, a data
type, and a shape described by its list of dimensions specified when
the variable is created. A variable may also have associated
attributes, which may be added, deleted or changed after the variable
is created.

   A variable external data type is one of a small set of netCDF types.
In classic and 64-bit offset files, only the original six types are
available (byte, character, short, int, float, and double).

   For more information on types for the C interface, *Note Variable
Types: (netcdf-c)Variable Types.

   For more information on types for the Fortran interface, *Note
Variable Types: (netcdf-f77)Variable Types.

   In the CDL notation, only classic and 64-bit offset type can be
used. They are given the simpler names byte, char, short, int, float,
and double. real may be used as a synonym for float in the CDL
notation. long is a deprecated synonym for int. For the exact meaning
of each of the types see *Note netCDF external data types::.

   CDL variable declarations appear after the variable keyword in a CDL
unit. They have the form

          type variable_name ( dim_name_1, dim_name_2, ... );

   for variables with dimensions, or

          type variable_name;

   for scalar variables.

   In the above CDL example there are six variables. As discussed below,
four of these are coordinate variables. The remaining variables
(sometimes called primary variables), temp and rh, contain what is
usually thought of as the data. Each of these variables has the
unlimited dimension time as its first dimension, so they are called
record variables. A variable that is not a record variable has a fixed
length (number of data values) given by the product of its dimension
lengths. The length of a record variable is also the product of its
dimension lengths, but in this case the product is variable because it
involves the length of the unlimited dimension, which can vary. The
length of the unlimited dimension is the number of records.

2.3.1 Coordinate Variables
--------------------------

It is legal for a variable to have the same name as a dimension. Such
variables have no special meaning to the netCDF library. However there
is a convention that such variables should be treated in a special way
by software using this library.

   A variable with the same name as a dimension is called a coordinate
variable. It typically defines a physical coordinate corresponding to
that dimension. The above CDL example includes the coordinate variables
lat, lon, level and time, defined as follows:

             int     lat(lat), lon(lon), level(level);
             short   time(time);
     ...
     data:
             level   = 1000, 850, 700, 500;
             lat     = 20, 30, 40, 50, 60;
             lon     = -160,-140,-118,-96,-84,-52,-45,-35,-25,-15;
             time    = 12;

   These define the latitudes, longitudes, barometric pressures and
times corresponding to positions along these dimensions. Thus there is
data at altitudes corresponding to 1000, 850, 700 and 500 millibars;
and at latitudes 20, 30, 40, 50 and 60 degrees north. Note that each
coordinate variable is a vector and has a shape consisting of just the
dimension with the same name.

   A position along a dimension can be specified using an index. This is
an integer with a minimum value of 0 for C programs, 1 in Fortran
programs. Thus the 700 millibar level would have an index value of 2 in
the example above in a C program, and 3 in a Fortran program.

   If a dimension has a corresponding coordinate variable, then this
provides an alternative, and often more convenient, means of specifying
position along it. Current application packages that make use of
coordinate variables commonly assume they are numeric vectors and
strictly monotonic (all values are different and either increasing or
decreasing).

2.4 Attributes
==============

NetCDF attributes are used to store data about the data (ancillary data
or metadata), similar in many ways to the information stored in data
dictionaries and schema in conventional database systems. Most
attributes provide information about a specific variable. These are
identified by the name (or ID) of that variable, together with the name
of the attribute.

   Some attributes provide information about the dataset as a whole and
are called global attributes. These are identified by the attribute
name together with a blank variable name (in CDL) or a special null
"global variable" ID (in C or Fortran).

   An attribute has an associated variable (the null "global variable"
for a global or group-level attribute), a name, a data type, a length,
and a value. The current version treats all attributes as vectors;
scalar values are treated as single-element vectors.

   Conventional attribute names should be used where applicable. New
names should be as meaningful as possible.

   The external type of an attribute is specified when it is created.
The types permitted for attributes are the same as the netCDF external
data types for variables. Attributes with the same name for different
variables should sometimes be of different types. For example, the
attribute valid_max specifying the maximum valid data value for a
variable of type int should be of type int, whereas the attribute
valid_max for a variable of type double should instead be of type
double.

   Attributes are more dynamic than variables or dimensions; they can be
deleted and have their type, length, and values changed after they are
created, whereas the netCDF interface provides no way to delete a
variable or to change its type or shape.

   The CDL notation for defining an attribute is

         variable_name:attribute_name = list_of_values;

   for a variable attribute, or

         :attribute_name = list_of_values;

   for a global attribute.

   The type and length of each attribute are not explicitly declared in
CDL; they are derived from the values assigned to the attribute. All
values of an attribute must be of the same type. The notation used for
constant values of the various netCDF types is discussed later (*note
CDL Notation for Data Constants::).

   In the netCDF example (*note The NetCDF Data Model::), units is an
attribute for the variable lat that has a 13-character array value
'degrees_north'. And valid_range is an attribute for the variable rh
that has length 2 and values '0.0' and '1.0'.

   One global attribute, called "source", is defined for the example
netCDF dataset. This is a character array intended for documenting the
data. Actual netCDF datasets might have more global attributes to
document the origin, history, conventions, and other characteristics of
the dataset as a whole.

   Most generic applications that process netCDF datasets assume
standard attribute conventions and it is strongly recommended that
these be followed unless there are good reasons for not doing so. For
information about units, long_name, valid_min, valid_max, valid_range,
scale_factor, add_offset, _FillValue, and other conventional
attributes, see *Note Attribute Conventions::.

   Attributes may be added to a netCDF dataset long after it is first
defined, so you don't have to anticipate all potentially useful
attributes. However adding new attributes to an existing classic or
64-bit offset format dataset can incur the same expense as copying the
dataset. For a more extensive discussion see *Note File Structure and
Performance::.

2.5 Differences between Attributes and Variables
================================================

In contrast to variables, which are intended for bulk data, attributes
are intended for ancillary data, or information about the data. The
total amount of ancillary data associated with a netCDF object, and
stored in its attributes, is typically small enough to be
memory-resident. However variables are often too large to entirely fit
in memory and must be split into sections for processing.

   Another difference between attributes and variables is that variables
may be multidimensional. Attributes are all either scalars
(single-valued) or vectors (a single, fixed dimension).

   Variables are created with a name, type, and shape before they are
assigned data values, so a variable may exist with no values.  The
value of an attribute is specified when it is created, unless it is a
zero-length attribute.

   A variable may have attributes, but an attribute cannot have
attributes. Attributes assigned to variables may have the same units as
the variable (for example, valid_range) or have no units (for example,
scale_factor). If you want to store data that requires units different
from those of the associated variable, it is better to use a variable
than an attribute. More generally, if data require ancillary data to
describe them, are multidimensional, require any of the defined netCDF
dimensions to index their values, or require a significant amount of
storage, that data should be represented using variables rather than
attributes.

3 Data
******

This chapter discusses the primitive netCDF external data types, the
kinds of data access supported by the netCDF interface, and how data
structures other than arrays may be implemented in a netCDF dataset.

3.1 netCDF external data types
==============================

The atomic external types supported by the netCDF interface are:

C name      Fortran     storage
            name        
NC_BYTE     nf_byte     8-bit signed integer
NC_CHAR     nf_char     8-bit unsigned integer
NC_SHORT    nf_short    16-bit signed integer
NC_INT (or  nf_int      32-bit signed integer
NC_LONG)                
NC_FLOAT    nf_float    32-bit floating point
NC_DOUBLE   nf_double   64-bit floating point

   These types were chosen to provide a reasonably wide range of
trade-offs between data precision and number of bits required for each
value. These external data types are independent from whatever internal
data types are supported by a particular machine and language
combination.

   These types are called "external", because they correspond to the
portable external representation for netCDF data. When a program reads
external netCDF data into an internal variable, the data is converted,
if necessary, into the specified internal type. Similarly, if you write
internal data into a netCDF variable, this may cause it to be converted
to a different external type, if the external type for the netCDF
variable differs from the internal type.

   The separation of external and internal types and automatic type
conversion have several advantages. You need not be aware of the
external type of numeric variables, since automatic conversion to or
from any desired numeric type is available. You can use this feature to
simplify code, by making it independent of external types, using a
sufficiently wide internal type, e.g., double precision, for numeric
netCDF data of several different external types. Programs need not be
changed to accommodate a change to the external type of a variable.

   If conversion to or from an external numeric type is necessary, it is
handled by the library.

   Converting from one numeric type to another may result in an error if
the target type is not capable of representing the converted value. For
example, an internal short integer type may not be able to hold data
stored externally as an integer. When accessing an array of values, a
range error is returned if one or more values are out of the range of
representable values, but other values are converted properly.

   Note that mere loss of precision in type conversion does not return
an error. Thus, if you read double precision values into a
single-precision floating-point variable, for example, no error results
unless the magnitude of the double precision value exceeds the
representable range of single-precision floating point numbers on your
platform. Similarly, if you read a large integer into a float incapable
of representing all the bits of the integer in its mantissa, this loss
of precision will not result in an error. If you want to avoid such
precision loss, check the external types of the variables you access to
make sure you use an internal type that has adequate precision.

   The names for the primitive external data types (byte, char, short,
ushort, int, uint, int64, uint64, float or real, double, bool, string)
are reserved words in CDL, so the names of variables, dimensions, and
attributes must not be type names.

   It is possible to interpret byte data as either signed (-128 to 127)
or unsigned (0 to 255). However, when reading byte data to be converted
into other numeric types, it is interpreted as signed.

   For the correspondence between netCDF external data types and the
data types of a language see *Note Variables::.

3.2 Data Access
===============

To access (read or write) netCDF data you specify an open netCDF
dataset, a netCDF variable, and information (e.g., indices) identifying
elements of the variable. The name of the access function corresponds
to the internal type of the data. If the internal type has a different
representation from the external type of the variable, a conversion
between the internal type and external type will take place when the
data is read or written.

   Access to data in classic and 64-bit offset format is direct. In
either case you can access a small subset of data from a large dataset
efficiently, without first accessing all the data that precedes it.

   Reading and writing data by specifying a variable, instead of a
position in a file, makes data access independent of how many other
variables are in the dataset, making programs immune to data format
changes that involve adding more variables to the data.

   In the C and FORTRAN interfaces, datasets are not specified by name
every time you want to access data, but instead by a small integer
called a dataset ID, obtained when the dataset is first created or
opened.

   Similarly, a variable is not specified by name for every data access
either, but by a variable ID, a small integer used to identify each
variable in a netCDF dataset.

3.2.1 Forms of Data Access
--------------------------

The netCDF interface supports several forms of direct access to data
values in an open netCDF dataset. We describe each of these forms of
access in order of increasing generality:

   * access to all elements;

   * access to individual elements, specified with an index vector;

   * access to array sections, specified with an index vector, and count
     vector;

   * access to sub-sampled array sections, specified with an index
     vector, count vector, and stride vector; and

   * access to mapped array sections, specified with an index vector,
     count vector, stride vector, and an index mapping vector.


   The four types of vector (index vector, count vector, stride vector
and index mapping vector) each have one element for each dimension of
the variable. Thus, for an n-dimensional variable (rank = n), n-element
vectors are needed. If the variable is a scalar (no dimensions), these
vectors are ignored.

   An array section is a "slab" or contiguous rectangular block that is
specified by two vectors. The index vector gives the indices of the
element in the corner closest to the origin. The count vector gives the
lengths of the edges of the slab along each of the variable's
dimensions, in order. The number of values accessed is the product of
these edge lengths.

   A subsampled array section is similar to an array section, except
that an additional stride vector is used to specify sampling. This
vector has an element for each dimension giving the length of the
strides to be taken along that dimension. For example, a stride of 4
means every fourth value along the corresponding dimension. The total
number of values accessed is again the product of the elements of the
count vector.

   A mapped array section is similar to a subsampled array section
except that an additional index mapping vector allows one to specify
how data values associated with the netCDF variable are arranged in
memory. The offset of each value from the reference location, is given
by the sum of the products of each index (of the imaginary internal
array which would be used if there were no mapping) by the
corresponding element of the index mapping vector. The number of values
accessed is the same as for a subsampled array section.

   The use of mapped array sections is discussed more fully below, but
first we present an example of the more commonly used array-section
access.

3.2.2 A C Example of Array-Section Access
-----------------------------------------

Assume that in our earlier example of a netCDF dataset (*note Network
Common Data Form Language (CDL): The NetCDF Data Model.), we wish to
read a cross-section of all the data for the temp variable at one level
(say, the second), and assume that there are currently three records
(time values) in the netCDF dataset. Recall that the dimensions are
defined as

       lat = 5, lon = 10, level = 4, time = unlimited;

   and the variable temp is declared as

       float   temp(time, level, lat, lon);

   in the CDL notation.

   A corresponding C variable that holds data for only one level might
be declared as

     #define LATS  5
     #define LONS 10
     #define LEVELS 1
     #define TIMES 3                 /* currently */
         ...
     float   temp[TIMES*LEVELS*LATS*LONS];

   to keep the data in a one-dimensional array, or

         ...
     float   temp[TIMES][LEVELS][LATS][LONS];

   using a multidimensional array declaration.

   To specify the block of data that represents just the second level,
all times, all latitudes, and all longitudes, we need to provide a
start index and some edge lengths. The start index should be (0, 1, 0,
0) in C, because we want to start at the beginning of each of the time,
lon, and lat dimensions, but we want to begin at the second value of
the level dimension. The edge lengths should be (3, 1, 5, 10) in C,
(since we want to get data for all three time values, only one level
value, all five lat values, and all 10 lon values. We should expect to
get a total of 150 floating-point values returned (3 * 1 * 5 * 10), and
should provide enough space in our array for this many. The order in
which the data will be returned is with the last dimension, lon,
varying fastest:

          temp[0][1][0][0]
          temp[0][1][0][1]
          temp[0][1][0][2]
          temp[0][1][0][3]

                ...

          temp[2][1][4][7]
          temp[2][1][4][8]
          temp[2][1][4][9]

   Different dimension orders for the C, FORTRAN, or other language
interfaces do not reflect a different order for values stored on the
disk, but merely different orders supported by the procedural
interfaces to the languages. In general, it does not matter whether a
netCDF dataset is written using the C, FORTRAN, or another language
interface; netCDF datasets written from any supported language may be
read by programs written in other supported languages.

3.2.3 More on General Array Section Access for C
------------------------------------------------

The use of mapped array sections allows non-trivial relationships
between the disk addresses of variable elements and the addresses where
they are stored in memory. For example, a matrix in memory could be the
transpose of that on disk, giving a quite different order of elements.
In a regular array section, the mapping between the disk and memory
addresses is trivial: the structure of the in-memory values (i.e., the
dimensional lengths and their order) is identical to that of the array
section. In a mapped array section, however, an index mapping vector is
used to define the mapping between indices of netCDF variable elements
and their memory addresses.

   With mapped array access, the offset (number of array elements) from
the origin of a memory-resident array to a particular point is given by
the inner product[1] of the index mapping vector with the point's
coordinate offset vector. A point's coordinate offset vector gives, for
each dimension, the offset from the origin of the containing array to
the point.In C, a point's coordinate offset vector is the same as its
coordinate vector.

   The index mapping vector for a regular array section would have--in
order from most rapidly varying dimension to most slowly--a constant 1,
the product of that value with the edge length of the most rapidly
varying dimension of the array section, then the product of that value
with the edge length of the next most rapidly varying dimension, and so
on. In a mapped array, however, the correspondence between netCDF
variable disk locations and memory locations can be different.

   For example, the following C definitions

     struct vel {
         int flags;
         float u;
         float v;
     } vel[NX][NY];
     ptrdiff_t imap[2] = {
         sizeof(struct vel),
         sizeof(struct vel)*NY
     };

   where imap is the index mapping vector, can be used to access the
memory-resident values of the netCDF variable, vel(NY,NX), even though
the dimensions are transposed and the data is contained in a 2-D array
of structures rather than a 2-D array of floating-point values.

   A detailed example of mapped array access is presented in the
description of the interfaces for mapped array access. *Note Write a
Mapped Array of Values - nc_put_varm_ type: (netcdf-c)nc_put_varm_ type.

   Note that, although the netCDF abstraction allows the use of
subsampled or mapped array-section access there use is not required. If
you do not need these more general forms of access, you may ignore
these capabilities and use single value access or regular array section
access instead.

3.2.4 A Fortran Example of Array-Section Access
-----------------------------------------------

Assume that in our earlier example of a netCDF dataset (*note The
NetCDF Data Model::), we wish to read a cross-section of all the data
for the temp variable at one level (say, the second), and assume that
there are currently three records (time values) in the netCDF dataset.
Recall that the dimensions are defined as

       lat = 5, lon = 10, level = 4, time = unlimited;

   and the variable temp is declared as

       float   temp(time, level, lat, lon);

   in the CDL notation.

   In FORTRAN, the dimensions are reversed from the CDL declaration with
the first dimension varying fastest and the record dimension as the
last dimension of a record variable. Thus a FORTRAN declarations for a
variable that holds data for only one level is

     INTEGER LATS, LONS, LEVELS, TIMES
     PARAMETER (LATS=5, LONS=10, LEVELS=1, TIMES=3)
        ...
     REAL TEMP(LONS, LATS, LEVELS, TIMES)

   To specify the block of data that represents just the second level,
all times, all latitudes, and all longitudes, we need to provide a
start index and some edge lengths. The start index should be (1, 1, 2,
1) in FORTRAN, because we want to start at the beginning of each of the
time, lon, and lat dimensions, but we want to begin at the second value
of the level dimension. The edge lengths should be (10, 5, 1, 3) in
FORTRAN, since we want to get data for all three time values, only one
level value, all five lat values, and all 10 lon values. We should
expect to get a total of 150 floating-point values returned (3 * 1 * 5
* 10), and should provide enough space in our array for this many. The
order in which the data will be returned is with the first dimension,
LON, varying fastest:

          TEMP( 1, 1, 2, 1)
          TEMP( 2, 1, 2, 1)
          TEMP( 3, 1, 2, 1)
          TEMP( 4, 1, 2, 1)

                ...

          TEMP( 8, 5, 2, 3)
          TEMP( 9, 5, 2, 3)
          TEMP(10, 5, 2, 3)

   Different dimension orders for the C, FORTRAN, or other language
interfaces do not reflect a different order for values stored on the
disk, but merely different orders supported by the procedural
interfaces to the languages. In general, it does not matter whether a
netCDF dataset is written using the C, FORTRAN, or another language
interface; netCDF datasets written from any supported language may be
read by programs written in other supported languages.

3.2.5 More on General Array Section Access for Fortran
------------------------------------------------------

The use of mapped array sections allows non-trivial relationships
between the disk addresses of variable elements and the addresses where
they are stored in memory. For example, a matrix in memory could be the
transpose of that on disk, giving a quite different order of elements.
In a regular array section, the mapping between the disk and memory
addresses is trivial: the structure of the in-memory values (i.e., the
dimensional lengths and their order) is identical to that of the array
section. In a mapped array section, however, an index mapping vector is
used to define the mapping between indices of netCDF variable elements
and their memory addresses.

   With mapped array access, the offset (number of array elements) from
the origin of a memory-resident array to a particular point is given by
the inner product[1] of the index mapping vector with the point's
coordinate offset vector. A point's coordinate offset vector gives, for
each dimension, the offset from the origin of the containing array to
the point. In FORTRAN, the values of a point's coordinate offset vector
are one less than the corresponding values of the point's coordinate
vector, e.g., the array element A(3,5) has coordinate offset vector [2,
4].

   The index mapping vector for a regular array section would have--in
order from most rapidly varying dimension to most slowly--a constant 1,
the product of that value with the edge length of the most rapidly
varying dimension of the array section, then the product of that value
with the edge length of the next most rapidly varying dimension, and so
on. In a mapped array, however, the correspondence between netCDF
variable disk locations and memory locations can be different.

   A detailed example of mapped array access is presented in the
description of the interfaces for mapped array access. *Note
nf_put_varm_ type: (netcdf-f77)nf_put_varm_ type.

   Note that, although the netCDF abstraction allows the use of
subsampled or mapped array-section access there use is not required. If
you do not need these more general forms of access, you may ignore
these capabilities and use single value access or regular array section
access instead.

3.3 Type Conversion
===================

Each netCDF variable has an external type, specified when the variable
is first defined. This external type determines whether the data is
intended for text or numeric values, and if numeric, the range and
precision of numeric values.

   If the netCDF external type for a variable is char, only character
data representing text strings can be written to or read from the
variable. No automatic conversion of text data to a different
representation is supported.

   If the type is numeric, however, the netCDF library allows you to
access the variable data as a different type and provides automatic
conversion between the numeric data in memory and the data in the
netCDF variable. For example, if you write a program that deals with
all numeric data as double-precision floating point values, you can
read netCDF data into double-precision arrays without knowing or caring
what the external type of the netCDF variables are. On reading netCDF
data, integers of various sizes and single-precision floating-point
values will all be converted to double-precision, if you use the data
access interface for double-precision values. Of course, you can avoid
automatic numeric conversion by using the netCDF interface for a value
type that corresponds to the external data type of each netCDF
variable, where such value types exist.

   The automatic numeric conversions performed by netCDF are easy to
understand, because they behave just like assignment of data of one
type to a variable of a different type. For example, if you read
floating-point netCDF data as integers, the result is truncated towards
zero, just as it would be if you assigned a floating-point value to an
integer variable. Such truncation is an example of the loss of
precision that can occur in numeric conversions.

   Converting from one numeric type to another may result in an error if
the target type is not capable of representing the converted value. For
example, an integer may not be able to hold data stored externally as
an IEEE floating-point number. When accessing an array of values, a
range error is returned if one or more values are out of the range of
representable values, but other values are converted properly.

   Note that mere loss of precision in type conversion does not result
in an error. For example, if you read double precision values into an
integer, no error results unless the magnitude of the double precision
value exceeds the representable range of integers on your platform.
Similarly, if you read a large integer into a float incapable of
representing all the bits of the integer in its mantissa, this loss of
precision will not result in an error. If you want to avoid such
precision loss, check the external types of the variables you access to
make sure you use an internal type that has a compatible precision.

   Whether a range error occurs in writing a large floating-point value
near the boundary of representable values may be depend on the
platform. The largest floating-point value you can write to a netCDF
float variable is the largest floating-point number representable on
your system that is less than 2 to the 128th power. The largest double
precision value you can write to a double variable is the largest
double-precision number representable on your system that is less than
2 to the 1024th power.

3.4 Data Structures
===================

The only kind of data structure directly supported by the netCDF
classic (and 64-bit offset) abstraction is a collection of named arrays
with attached vector attributes. NetCDF is not particularly well-suited
for storing linked lists, trees, sparse matrices, ragged arrays or
other kinds of data structures requiring pointers.  However, the
dimension index associated with the unlimited dimension (also known as
the "record number") may be useful for storing linked lists on which
various data structures can be constructed.

   More generally, it is possible with netCDF to build various kinds of
data structures from sets of arrays by adopting various conventions
regarding the use of data in one array as pointers into another array.
The netCDF library won't provide much help or hindrance with
constructing such data structures, but netCDF provides the mechanisms
with which such conventions can be designed.

   The following netCDF classic example stores a ragged array
ragged_mat using an attribute row_index to name an associated index
variable giving the index of the start of each row. In this example,
the first row contains 12 elements, the second row contains 7 elements
(19 - 12), and so on.

             float   ragged_mat(max_elements);
                     ragged_mat:row_index = "row_start";
             int     row_start(max_rows);
     data:
             row_start   = 0, 12, 19, ...

   As another example, netCDF variables may be grouped within a netCDF
classic or 64-bit offset dataset by defining attributes that list the
names of the variables in each group, separated by a conventional
delimiter such as a space or comma. Using a naming convention for
attribute names for such groupings permits any number of named groups
of variables. A particular conventional attribute for each variable
might list the names of the groups of which it is a member. Use of
attributes, or variables that refer to other attributes or variables,
provides a flexible mechanism for representing some kinds of complex
structures in netCDF datasets.

4 File Structure and Performance
********************************

This chapter describes the file structure of a netCDF classic or 64-bit
offset dataset in enough detail to aid in understanding netCDF
performance issues.

   NetCDF is a data abstraction for array-oriented data access and a
software library that provides a concrete implementation of the
interfaces that support that abstraction. The implementation provides a
machine-independent format for representing arrays. Although the netCDF
file format is hidden below the interfaces, some understanding of the
current implementation and associated file structure may help to make
clear why some netCDF operations are more expensive than others.

   For a detailed description of the netCDF format, see *Note File
Structure and Performance::. Knowledge of the format is not needed for
reading and writing netCDF data or understanding most efficiency
issues. Programs that use only the documented interfaces and that make
no assumptions about the format will continue to work even if the
netCDF format is changed in the future, because any such change will be
made below the documented interfaces and will support earlier versions
of the netCDF file format.

4.1 Parts of a NetCDF Classic File
==================================

A netCDF classic or 64-bit offset dataset is stored as a single file
comprising two parts:
  1. a header, containing all the information about dimensions,
     attributes, and variables except for the variable data;

  2. a data part, comprising fixed-size data, containing the data for
     variables that don't have an unlimited dimension; and variable-size
     data, containing the data for variables that have an unlimited
     dimension.

   Both the header and data parts are represented in a
machine-independent form. This form is very similar to XDR (eXternal
Data Representation), extended to support efficient storage of arrays
of non-byte data.

   The header at the beginning of the file contains information about
the dimensions, variables, and attributes in the file, including their
names, types, and other characteristics. The information about each
variable includes the offset to the beginning of the variable's data
for fixed-size variables or the relative offset of other variables
within a record. The header also contains dimension lengths and
information needed to map multidimensional indices for each variable to
the appropriate offsets.

   By default, this header has little usable extra space; it is only as
large as it needs to be for the dimensions, variables, and attributes
(including all the attribute values) in the netCDF dataset, with a
small amount of extra space from rounding up to the nearest disk block
size. This has the advantage that netCDF files are compact, requiring
very little overhead to store the ancillary data that makes the
datasets self-describing. A disadvantage of this organization is that
any operation on a netCDF dataset that requires the header to grow (or,
less likely, to shrink), for example adding new dimensions or new
variables, requires moving the data by copying it. This expense is
incurred when the enddef function is called: nc_enddef in C (*note
nc_enddef: (netcdf-c)nc_enddef.), NF_ENDDEF in Fortran (*note
NF_ENDDEF: (netcdf-f77)NF_ENDDEF.), after a previous call to the redef
function: nc_redef in C (*note nc_redef: (netcdf-c)nc_redef.) or
NF_REDEF in Fortran (*note NF_REDEF: (netcdf-f77)NF_REDEF.). If you
create all necessary dimensions, variables, and attributes before
writing data, and avoid later additions and renamings of netCDF
components that require more space in the header part of the file, you
avoid the cost associated with later changing the header.

   Alternatively, you can use an alternative version of the enddef
function with two underbar characters instead of one to explicitly
reserve extra space in the file header when the file is created: in C
nc__enddef (*note nc__enddef: (netcdf-c)nc__enddef.), in Fortran
NF__ENDDEF (*note NF__ENDDEF: (netcdf-f77)NF__ENDDEF.), after a
previous call to the redef function.  This avoids the expense of moving
all the data later by reserving enough extra space in the header to
accommodate anticipated changes, such as the addition of new attributes
or the extension of existing string attributes to hold longer strings.

   When the size of the header is changed, data in the file is moved,
and the location of data values in the file changes. If another program
is reading the netCDF dataset during redefinition, its view of the file
will be based on old, probably incorrect indexes. If netCDF datasets
are shared across redefinition, some mechanism external to the netCDF
library must be provided that prevents access by readers during
redefinition, and causes the readers to call nc_sync/NF_SYNC before any
subsequent access.

   The fixed-size data part that follows the header contains all the
variable data for variables that do not employ an unlimited dimension.
The data for each variable is stored contiguously in this part of the
file. If there is no unlimited dimension, this is the last part of the
netCDF file.

   The record-data part that follows the fixed-size data consists of a
variable number of fixed-size records, each of which contains data for
all the record variables. The record data for each variable is stored
contiguously in each record.

   The order in which the variable data appears in each data section is
the same as the order in which the variables were defined, in
increasing numerical order by netCDF variable ID. This knowledge can
sometimes be used to enhance data access performance, since the best
data access is currently achieved by reading or writing the data in
sequential order.

   For more detail see *Note File Format::.

4.2 The Extended XDR Layer
==========================

XDR is a standard for describing and encoding data and a library of
functions for external data representation, allowing programmers to
encode data structures in a machine-independent way. Classic or 64-bit
offset NetCDF employs an extended form of XDR for representing
information in the header part and the data parts. This extended XDR is
used to write portable data that can be read on any other machine for
which the library has been implemented.

   The cost of using a canonical external representation for data varies
according to the type of data and whether the external form is the same
as the machine's native form for that type.

   For some data types on some machines, the time required to convert
data to and from external form can be significant. The worst case is
reading or writing large arrays of floating-point data on a machine
that does not use IEEE floating-point as its native representation.

4.3 Large File Support
======================

It is possible to write netCDF files that exceed 2 GiByte on platforms
that have "Large File Support" (LFS). Such files are
platform-independent to other LFS platforms, but trying to open them on
an older platform without LFS yields a "file too large" error.

   Without LFS, no files larger than 2 GiBytes can be used. The rest of
this section applies only to systems with LFS.

   The original binary format of netCDF (classic format) limits the size
of data files by using a signed 32-bit offset within its internal
structure. Files larger than 2 GiB can be created, with certain
limitations. *Note NetCDF Classic Format Limitations::.

   In version 3.6.0, netCDF included its first-ever variant of the
underlying data format.  The new format introduced in 3.6.0 uses 64-bit
file offsets in place of the 32-bit offsets. There are still some
limits on the sizes of variables, but the new format can create very
large datasets. *Note NetCDF 64 bit Offset Format Limitations::.

   The original data format (netCDF classic), is still the default data
format for the netCDF library.

   The following table summarizes the size limitations of various
permutations of LFS support, netCDF version, and data format. Note that
1 GiB = 2^30 bytes or about 1.07e+9 bytes, 1 EiB = 2^60 bytes or about
1.15e+18 bytes. Note also that all sizes are really 4 bytes less than
the ones given below. For example the maximum size of a fixed variable
in netCDF 3.6 classic format is really 2 GiB - 4 bytes.

Limit              No LFS      v3.5        v3.6/classicv3.6/64-bit
                                                       offset
Max File Size      2 GiB       8 EiB       8 EiB       8 EiB
Max Number of      0           1 (last)    1 (last)    2^32
Fixed Vars > 2                                         
GiB                                                    
Max Record Vars    0           1 (last)    1 (last)    2^32
w/ Rec Size > 2                                        
GiB                                                    
Max Size of        2 GiB       2 GiB       2 GiB       4 GiB
Fixed/Record Size                                      
of Record Var                                          
Max Record Size    2           4 GiB       8 EiB/nrecs 8 EiB/nrecs
                   GiB/nrecs                           

   For more information about the different file formats of netCDF
*Note Which Format::.

4.4 NetCDF 64-bit Offset Format Limitations
===========================================

Although the 64-bit offset format allows the creation of much larger
netCDF files than was possible with the classic format, there are still
some restrictions on the size of variables.

   It's important to note that without Large File Support (LFS) in the
operating system, it's impossible to create any file larger than 2
GiBytes.  Assuming an operating system with LFS, the following
restrictions apply to the netCDF 64-bit offset format.

   No fixed-size variable can require more than 2^32 - 4 bytes (i.e.
4GiB - 4 bytes, or 4,294,967,292 bytes) of storage for its data, unless
it is the last fixed-size variable and there are no record variables.
When there are no record variables, the last fixed-size variable can be
any size supported by the file system, e.g. terabytes.

   A 64-bit offset format netCDF file can have up to 2^32 - 1 fixed
sized variables, each under 4GiB in size. If there are no record
variables in the file the last fixed variable can be any size.

   No record variable can require more than 2^32 - 4 bytes of storage
for each record's worth of data, unless it is the last record variable.
A 64-bit offset format netCDF file can have up to 2^32 - 1 records, of
up to 2^32 - 1 variables, as long as the size of one record's data for
each record variable except the last is less than 4 GiB - 4.

   Note also that all netCDF variables and records are padded to 4 byte
boundaries.

4.5 NetCDF Classic Format Limitations
=====================================

There are important constraints on the structure of large netCDF
classic files that result from the 32-bit relative offsets that are
part of the netCDF classic file format:

   The maximum size of a record in the classic format in versions 3.5.1
and earlier is 2^32 - 4 bytes, or about 4 GiB.  In versions 3.6.0 and
later, there is no such restriction on total record size for the classic
format or 64-bit offset format.

   The maximum size of a record in the classic format in versions 3.5.1
and earlier is 2^32 - 4 bytes, or about 4 GiB.  In versions 3.6.0 and
later, there is no such restriction on total record size for the classic
format or 64-bit offset format.

   If you don't use the unlimited dimension, only one variable can
exceed 2 GiB in size, but it can be as large as the underlying file
system permits. It must be the last variable in the dataset, and the
offset to the beginning of this variable must be less than about 2 GiB.

   The limit is really 2^31 - 4.  If you were to specify a variable
size of 2^31 - 3, for example, it would be rounded up to the nearest
multiple of 4 bytes, which would be 2^31, which is larger than the
largest signed integer, 2^31 - 1.

   For example, the structure of the data might be something like:

     netcdf bigfile1 {
         dimensions:
            x=2000;
            y=5000;
            z=10000;
         variables:
            double x(x);         // coordinate variables
            double y(y);
            double z(z);
            double var(x, y, z); // 800 Gbytes
         }

   If you use the unlimited dimension, record variables may exceed 2
GiB in size, as long as the offset of the start of each record variable
within a record is less than 2 GiB - 4. For example, the structure of
the data in a 2.4 Tbyte file might be something like:

     netcdf bigfile2 {
         dimensions:
            x=2000;
            y=5000;
            z=10;
            t=UNLIMITED;         // 1000 records, for example
         variables:
            double x(x);         // coordinate variables
            double y(y);
            double z(z);
            double t(t);
                                 // 3 record variables, 2400000000 bytes per record
            double var1(t, x, y, z);
            double var2(t, x, y, z);
            double var3(t, x, y, z);
         }

4.6 The NetCDF-3 I/O Layer
==========================

For netCDF classic and 64-bit offset files, an I/O layer implemented
much like the C standard I/O (stdio) library is used by netCDF to read
and write portable data to netCDF datasets. Hence an understanding of
the standard I/O library provides answers to many questions about
multiple processes accessing data concurrently, the use of I/O buffers,
and the costs of opening and closing netCDF files. In particular, it is
possible to have one process writing a netCDF dataset while other
processes read it.

   Data reads and writes are no more atomic than calls to stdio fread()
and fwrite(). An nc_sync/NF_SYNC call is analogous to the fflush call
in the C standard I/O library, writing unwritten buffered data so other
processes can read it; The C function nc_sync (*note nc_sync:
(netcdf-c)nc_sync.), or the Fortran function NF_SYNC (*note NF_SYNC:
(netcdf-f77)NF_SYNC.), also brings header changes up-to-date (for
example, changes to attribute values). Opening the file with the
NC_SHARE (in C) or the NF_SHARE (in Fortran) is analogous to setting a
stdio stream to be unbuffered with the _IONBF flag to setvbuf.

   As in the stdio library, flushes are also performed when "seeks"
occur to a different area of the file. Hence the order of read and write
operations can influence I/O performance significantly. Reading data in
the same order in which it was written within each record will minimize
buffer flushes.

   You should not expect netCDF classic or 64-bit offset format data
access to work with multiple writers having the same file open for
writing simultaneously.

   It is possible to tune an implementation of netCDF for some platforms
by replacing the I/O layer with a different platform-specific I/O
layer. This may change the similarities between netCDF and standard
I/O, and hence characteristics related to data sharing, buffering, and
the cost of I/O operations.

   The distributed netCDF implementation is meant to be portable.
Platform-specific ports that further optimize the implementation for
better I/O performance are practical in some cases.

4.7 UNICOS Optimization
=======================

It should be noted that no UNICOS platform has been available at
Unidata for netCDF testing for some years. The following information is
left here for historical reasons.

   As was mentioned in the previous section, it is possible to replace
the I/O layer in order to increase I/O efficiency. This has been done
for UNICOS, the operating system of Cray computers similar to the Cray
Y-MP.

   Additionally, it is possible for the user to obtain even greater I/O
efficiency through appropriate setting of the NETCDF_FFIOSPEC
environment variable. This variable specifies the Flexible File I/O
buffers for netCDF I/O when executing under the UNICOS operating system
(the variable is ignored on other operating systems). An appropriate
specification can greatly increase the efficiency of netCDF I/O--to the
extent that it can surpass default FORTRAN binary I/O. Possible
specifications include the following:

`bufa:336:2'
     2, asynchronous, I/O buffers of 336 blocks each (i.e., double
     buffering). This is the default specification and favors sequential
     I/O.

`cache:256:8'
     8, synchronous, 256-block buffers. This favors larger random
     accesses.

`cachea:256:8:2'
     8, asynchronous, 256-block buffers with a 2 block
     read-ahead/write-behind factor. This also favors larger random
     accesses.

`cachea:8:256:0'
     256, asynchronous, 8-block buffers without
     read-ahead/write-behind. This favors many smaller pages without
     read-ahead for more random accesses as typified by slicing netCDF
     arrays.

`cache:8:256,cachea.sds:1024:4:1'
     This is a two layer cache. The first (synchronous) layer is
     composed of 256 8-block buffers in memory, the second
     (asynchronous) layer is composed of 4 1024-block buffers on the
     SSD. This scheme works well when accesses proceed through the
     dataset in random waves roughly 2x1024-blocks wide.


   All of the options/configurations supported in CRI's FFIO library are
available through this mechanism. We recommend that you look at CRI's
I/O optimization guide for information on using FFIO to its fullest.
This mechanism is also compatible with CRI's EIE I/O library.

   Tuning the NETCDF_FFIOSPEC variable to a program's I/O pattern can
dramatically improve performance. Speedups of two orders of magnitude
have been seen.

5 NetCDF Utilities
******************

One of the primary reasons for using the netCDF interface for
applications that deal with arrays is to take advantage of higher-level
netCDF utilities and generic applications for netCDF data. Currently
two netCDF utilities are available as part of the netCDF software
distribution:

`ncdump'
     reads a netCDF dataset and prints a textual representation of the
     information in the dataset

`ncgen'
     reads a textual representation of a netCDF dataset and generates
     the corresponding binary netCDF file or a C or FORTRAN program to
     create the netCDF dataset


   Users have contributed other netCDF utilities, and various
visualization and analysis packages are available that access netCDF
data. For an up-to-date list of freely-available and commercial
software that can access or manipulate netCDF data, see the NetCDF
Software list,
`http://www.unidata.ucar.edu/software/netcdf/software.html'.

   This chapter describes the ncgen and ncdump utilities. These two
tools convert between binary netCDF datasets and a text representation
of netCDF datasets. The output of ncdump and the input to ncgen is a
text description of a netCDF dataset in a tiny language known as CDL
(network Common data form Description Language).

5.1 CDL Syntax
==============

Below is an example of CDL, describing a netCDF dataset with several
named dimensions (lat, lon, time), variables (z, t, p, rh, lat, lon,
time), variable attributes (units, _FillValue, valid_range), and some
data.

     netcdf foo {    // example netCDF specification in CDL

     dimensions:
     lat = 10, lon = 5, time = unlimited;

     variables:
       int     lat(lat), lon(lon), time(time);
       float   z(time,lat,lon), t(time,lat,lon);
       double  p(time,lat,lon);
       int     rh(time,lat,lon);

       lat:units = "degrees_north";
       lon:units = "degrees_east";
       time:units = "seconds";
       z:units = "meters";
       z:valid_range = 0., 5000.;
       p:_FillValue = -9999.;
       rh:_FillValue = -1;

     data:
       lat   = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
       lon   = -140, -118, -96, -84, -52;
     }

   All CDL statements are terminated by a semicolon. Spaces, tabs, and
newlines can be used freely for readability. Comments may follow the
double slash characters // on any line.

   A CDL description consists of three optional parts: dimensions,
variables, and data. The variable part may contain variable
declarations and attribute assignments.

   A dimension is used to define the shape of one or more of the
multidimensional variables described by the CDL description. A
dimension has a name and a length. At most one dimension in a CDL
description can have the unlimited length, which means a variable using
this dimension can grow to any length (like a record number in a file).

   A variable represents a multidimensional array of values of the same
type. A variable has a name, a data type, and a shape described by its
list of dimensions. Each variable may also have associated attributes
(see below) as well as data values. The name, data type, and shape of a
variable are specified by its declaration in the variable section of a
CDL description. A variable may have the same name as a dimension; by
convention such a variable contains coordinates of the dimension it
names.

   An attribute contains information about a variable or about the whole
netCDF dataset. Attributes may be used to specify such properties as
units, special values, maximum and minimum valid values, and packing
parameters. Attribute information is represented by single values or
arrays of values. For example, units is an attribute represented by a
character array such as celsius. An attribute has an associated
variable, a name, a data type, a length, and a value. In contrast to
variables that are intended for data, attributes are intended for
ancillary data (data about data).

   In CDL, an attribute is designated by a variable and attribute name,
separated by a colon (':'). It is possible to assign global attributes
to the netCDF dataset as a whole by omitting the variable name and
beginning the attribute name with a colon (':'). The data type of an
attribute in CDL is derived from the type of the value assigned to it.
The length of an attribute is the number of data values or the number
of characters in the character string assigned to it. Multiple values
are assigned to non-character attributes by separating the values with
commas (','). All values assigned to an attribute must be of the same
type.

   CDL names for variables, attributes, and dimensions should be a
legal identifier.  A legal identifier is any sequence of one or more
alphabetic characters, digits, and the following special characters:
'_', '.', '-', '@', and '+'. The identifier must, however, start with an
alphabetic character or underscore. Case is significant and names
commencing with underscore are reserved for system use.

   Note specifically that the extended ISO8859 character set
(characters in the range 0x80-0xFF) are specifically excluded because
they conflict with the UTF8 format (if that is enabled).

   The names for the primitive data types are reserved words in CDL, so
the names of variables, dimensions, and attributes must not be type
names.  The optional data section of a CDL description is where netCDF
variables may be initialized. The syntax of an initialization is simple:

     variable = value_1, value_2, ...;

   The comma-delimited list of constants may be separated by spaces,
tabs, and newlines. For multidimensional arrays, the last dimension
varies fastest. Thus, row-order rather than column order is used for
matrices. If fewer values are supplied than are needed to fill a
variable, it is extended with the fill value. The types of constants
need not match the type declared for a variable; coercions are done to
convert integers to floating point, for example. All meaningful type
conversions are supported.

   A special notation for fill values is supported: the _ character
designates a fill value for variables.

5.2 CDL Data Types
==================

The CDL data types are:

`char'
     Characters.

`byte'
     Eight-bit integers.

`short'
     16-bit signed integers.

`int'
     32-bit signed integers.

`long'
     (Deprecated, currently synonymous with int)

`float'
     IEEE single-precision floating point (32 bits).

`real'
     (Synonymous with float).

`double'
     IEEE double-precision floating point (64 bits).

   Except for the added data-type byte and the lack of the type
qualifier unsigned, CDL supports the same primitive data types as C. In
declarations, type names may be specified in either upper or lower case.

   The byte type differs from the char type in that it is intended for
eight-bit data, and the zero byte has no special significance, as it
may for character data. The ncgen utility converts byte declarations to
char declarations in the output C code and to BYTE, INTEGER*1, or
similar platform-specific declaration in output FORTRAN code.

   The short type holds values between -32768 and 32767. The ncgen
utility converts short declarations to short declarations in the output
C code and to INTEGER*2 declaration in output FORTRAN code.

   The ushort type is an unsigned short type. It holds values between 0
and 65536.

   The int type can hold values between -2147483648 and 2147483647. The
ncgen utility converts int declarations to int declarations in the
output C code and to INTEGER declarations in output FORTRAN code. In
CDL declarations integer and long are accepted as synonyms for int.

   The uint type is an unsigned int type. It holds values between 0 and
4294967296.

   The int64 is an 8-byte signed integer. It can hold values between
-9223372036854775808 and 9223372036854775807.

   The uint64 is an unsigned 8-byte integer type. It can hold values
between 0 and 18446744073709551616.

   The float type can hold values between about -3.4+38 and 3.4+38, with
external representation as 32-bit IEEE normalized single-precision
floating-point numbers. The ncgen utility converts float declarations
to float declarations in the output C code and to REAL declarations in
output FORTRAN code. In CDL declarations real is accepted as a synonym
for float.

   The double type can hold values between about -1.7+308 and 1.7+308,
with external representation as 64-bit IEEE standard normalized
double-precision, floating-point numbers. The ncgen utility converts
double declarations to double declarations in the output C code and to
DOUBLE PRECISION declarations in output FORTRAN code.

   The string type holds variable length strings.

   The bool type holds boolean values.

5.3 CDL Notation for Data Constants
===================================

This section describes the CDL notation for constants.

   Attributes are initialized in the variables section of a CDL
description by providing a list of constants that determines the
attribute's type and length. (In the C and FORTRAN procedural
interfaces to the netCDF library, the type and length of an attribute
must be explicitly provided when it is defined.) CDL defines a syntax
for constant values that permits distinguishing among different netCDF
types. The syntax for CDL constants is similar to C syntax, except that
type suffixes are appended to shorts and floats to distinguish them
from ints and doubles.

   A byte constant is represented by a single character or multiple
character escape sequence enclosed in single quotes. For example:

     'a'     // ASCII a
     '\0'    // a zero byte
     '\n'    // ASCII newline character
     '\33'   // ASCII escape character (33 octal)
     '\x2b'  // ASCII plus (2b hex)
     '\376'  // 377 octal = -127 (or 254) decimal

   Character constants are enclosed in double quotes. A character array
may be represented as a string enclosed in double quotes. Multiple
strings are concatenated into a single array of characters, permitting
long character arrays to appear on multiple lines. To support multiple
variable-length string values, a conventional delimiter such as ',' may
be used, but interpretation of any such convention for a string
delimiter must be implemented in software above the netCDF library
layer. The usual escape conventions for C strings are honored. For
example:

     "a"            // ASCII 'a'
     "Two\nlines\n" // a 10-character string with two embedded newlines
     "a bell:\007"  // a string containing an ASCII bell
     "ab","cde"     // the same as "abcde"

   The form of a short constant is an integer constant with an 's' or
'S' appended. If a short constant begins with '0', it is interpreted as
octal. When it begins with '0x', it is interpreted as a hexadecimal
constant. For example:

     2s      // a short 2
     0123s   // octal
     0x7ffs  // hexadecimal

   The form of an int constant is an ordinary integer constant. If an
int constant begins with '0', it is interpreted as octal. When it begins
with '0x', it is interpreted as a hexadecimal constant. Examples of
valid int constants include:

     -2
     0123            // octal
     0x7ff           // hexadecimal
     1234567890L     // deprecated, uses old long suffix

   The float type is appropriate for representing data with about seven
significant digits of precision. The form of a float constant is the
same as a C floating-point constant with an 'f' or 'F' appended. A
decimal point is required in a CDL float to distinguish it from an
integer. For example, the following are all acceptable float constants:

     -2.0f
     3.14159265358979f       // will be truncated to less precision
     1.f
     .1f

   The double type is appropriate for representing floating-point data
with about 16 significant digits of precision. The form of a double
constant is the same as a C floating-point constant. An optional 'd' or
'D' may be appended. A decimal point is required in a CDL double to
distinguish it from an integer. For example, the following are all
acceptable double constants:

     -2.0
     3.141592653589793
     1.0e-20
     1.d

5.4 ncgen
=========

The ncgen tool generates a netCDF file or a C or FORTRAN program that
creates a netCDF dataset. If no options are specified in invoking
ncgen, the program merely checks the syntax of the CDL input, producing
error messages for any violations of CDL syntax.

   UNIX syntax for invoking ncgen:

     ncgen [-b] [-o netcdf-file] [-c] [-f] [-k kind] [-x] [input-file]

   where:

`-b'
     Create a (binary) netCDF file. If the '-o' option is absent, a
     default file name will be constructed from the netCDF name
     (specified after the netcdf keyword in the input) by appending the
     '.nc' extension. Warning: if a file already exists with the
     specified name it will be overwritten.

`-o netcdf-file'
     Name for the netCDF file created. If this option is specified, it
     implies the '-b' option. (This option is necessary because netCDF
     files are direct-access files created with seek calls, and hence
     cannot be written to standard output.)

`-c'
     Generate C source code that will create a netCDF dataset matching
     the netCDF specification. The C source code is written to standard
     output. This is only useful for relatively small CDL files, since
     all the data is included in variable initializations in the
     generated program.

`-f'
     Generate FORTRAN source code that will create a netCDF dataset
     matching the netCDF specification. The FORTRAN source code is
     written to standard output. This is only useful for relatively
     small CDL files, since all the data is included in variable
     initializations in the generated program.

`-k kind'
     The generated netCDF file or program will be of the specified kind,
     one of "classic", "64-bit-offset", "hdf5", or "hdf5-nc3".  The
     default kind is "classic", the original variant of the netCDF file
     format with 32-bit offsets, limiting file sizes in most cases to 2
     GiB or less.  The kind "64-bit-offset" specifies the second
     version of the file format with 64-bit offsets, to allow for the
     creation of very large files.  These files are not as portable as
     classic format netCDF files, because they require version 3.6.0 or
     later of the netCDF library.  Specifying "hdf5" will produce a
     file using the netCDF-4 programming interfaces (not yet
     implemented), and is required if the CDL input includes features
     of the netCDF-4 data model, such as Groups or compound types.
     Specifying "hdf5-nc3" will create an HDF5 file using the netCDF-4
     interfaces (not yet implemented), but restricted to the netCDF-3
     data model, so that the file can be accessed and manipulated by
     unmodified netCDF-2 or netCDF-3 programs and utilities that have
     been relinked with the netCDF-4 library.  These four kinds of
     files may also be specified numerically using "1", "2", "3", or
     "4", respectively.  For backward comapatibility, `-v kind' is also
     accepted.

`-x'
     Use "no fill" mode, omitting the initialization of variable values
     with fill values.  This can make the creation of large files much
     faster, but it will also eliminate the possibility of detecting the
     inadvertent reading of values that haven't been written.

Examples
========

Check the syntax of the CDL file foo.cdl:

     ncgen foo.cdl

   From the CDL file foo.cdl, generate an equivalent binary netCDF file
named bar.nc:

     ncgen -o bar.nc foo.cdl

   From the CDL file foo.cdl, generate a C program containing netCDF
function invocations that will create an equivalent binary netCDF
dataset:

     ncgen -c foo.cdl > foo.c

5.5 ncdump
==========

The ncdump tool generates the CDL text representation of a netCDF
dataset on standard output, optionally excluding some or all of the
variable data in the output. The output from ncdump is intended to be
acceptable as input to ncgen. Thus ncdump and ncgen can be used as
inverses to transform data representation between binary and text
representations.

   ncdump may also be used as a simple browser for netCDF datasets, to
display the dimension names and lengths; variable names, types, and
shapes; attribute names and values; and optionally, the values of data
for all variables or selected variables in a netCDF dataset.  Another
use for ncdump is to determine what kind of netCDF file is used (which
variant of the netCDF file format).

   ncdump defines a default format used for each type of netCDF variable
data, but this can be overridden if a C_format attribute is defined for
a netCDF variable. In this case, ncdump will use the C_format attribute
to format values for that variable. For example, if floating-point data
for the netCDF variable Z is known to be accurate to only three
significant digits, it might be appropriate to use this variable
attribute:

     Z:C_format = "%.3g"

   ncdump uses '_' to represent data values that are equal to the
_FillValue attribute for a variable, intended to represent data that
has not yet been written. If a variable has no _FillValue attribute,
the default fill value for the variable type is used unless the
variable is of byte type.

   UNIX syntax for invoking ncdump:

     ncdump  [ -c | -h]  [-v var1,...]  [-b lang]  [-f lang]
     [-l len]  [ -p fdig[,ddig]]  [ -n name] [-k]  [input-file]

   where:

`-c'
     Show the values of coordinate variables (variables that are also
     dimensions) as well as the declarations of all dimensions,
     variables, and attribute values. Data values of non-coordinate
     variables are not included in the output. This is often the most
     suitable option to use for a brief look at the structure and
     contents of a netCDF dataset.

`-h'
     Show only the header information in the output, that is, output
     only the declarations for the netCDF dimensions, variables, and
     attributes of the input file, but no data values for any
     variables. The output is identical to using the '-c' option except
     that the values of coordinate variables are not included. (At most
     one of '-c' or '-h' options may be present.)

`-v var1,...'
     The output will include data values for the specified variables, in
     addition to the declarations of all dimensions, variables, and
     attributes. One or more variables must be specified by name in the
     comma-delimited list following this option. The list must be a
     single argument to the command, hence cannot contain blanks or
     other white space characters. The named variables must be valid
     netCDF variables in the input-file. The default, without this
     option and in the absence of the '-c' or '-h' options, is to
     include data values for all variables in the output.

`-b lang'
     A brief annotation in the form of a CDL comment (text beginning
     with the characters '//') will be included in the data section of
     the output for each 'row' of data, to help identify data values for
     multidimensional variables. If lang begins with 'C' or 'c', then C
     language conventions will be used (zero-based indices, last
     dimension varying fastest). If lang begins with 'F' or 'f', then
     FORTRAN language conventions will be used (one-based indices,
     first dimension varying fastest). In either case, the data will be
     presented in the same order; only the annotations will differ.
     This option may be useful for browsing through large volumes of
     multidimensional data.

`-f lang'
     Full annotations in the form of trailing CDL comments (text
     beginning with the characters '//') for every data value (except
     individual characters in character arrays) will be included in the
     data section. If lang begins with 'C' or 'c', then C language
     conventions will be used (zero-based indices, last dimension
     varying fastest). If lang begins with 'F' or 'f', then FORTRAN
     language conventions will be used (one-based indices, first
     dimension varying fastest). In either case, the data will be
     presented in the same order; only the annotations will differ.
     This option may be useful for piping data into other filters,
     since each data value appears on a separate line, fully
     identified. (At most one of '-b' or '-f' options may be present.)

`-l len'
     Changes the default maximum line length (80) used in formatting
     lists of non-character data values.

`-p float_digits[,double_digits]'
     Specifies default precision (number of significant digits) to use
     in displaying floating-point or double precision data values for
     attributes and variables. If specified, this value overrides the
     value of the C_format attribute, if any, for a variable.
     Floating-point data will be displayed with float_digits
     significant digits. If double_digits is also specified,
     double-precision values will be displayed with that many
     significant digits. In the absence of any '-p' specifications,
     floating-point and double-precision data are displayed with 7 and
     15 significant digits respectively. CDL files can be made smaller
     if less precision is required. If both floating-point and double
     precisions are specified, the two values must appear separated by
     a comma (no blanks) as a single argument to the command.

`-n name'
     CDL requires a name for a netCDF dataset, for use by 'ncgen -b' in
     generating a default netCDF dataset name. By default, ncdump
     constructs this name from the last component of the file name of
     the input netCDF dataset by stripping off any extension it has.
     Use the '-n' option to specify a different name. Although the
     output file name used by 'ncgen -b' can be specified, it may be
     wise to have ncdump change the default name to avoid inadvertently
     overwriting a valuable netCDF dataset when using ncdump, editing
     the resulting CDL file, and using 'ncgen -b' to generate a new
     netCDF dataset from the edited CDL file.

`-k'
     Show what kind of netCDF file input-file is, one of 'classic',
     '64-bit-offset','hdf5', or 'hdf5-nc3'.  Before version 3.6, there
     was only one kind of netCDF file, designated as 'classic' (also
     know as format variant 1).  Large file support introduced another
     variant of the format, designated as '64-bit-offset' (known as
     format variant 2).  NetCDF-4, uses a third variant of the format,
     'hdf5' (format variant 3).  Another format variant, designated
     'hdf5-classic' (format variant 4), is restricted to features
     supported by the netCDF-3 data model but represented using the
     HDF5 format, so that an unmodified netCDF-3 program can read or
     write the file just by relinking with the netCDF-4 library.  The
     string output by using the '-k' option may be provided as the
     value of the '-k' option to ncgen(1) to specify exactly what kind
     of netCDF file to generate, when you want to override the default
     inferred from the CDL.


Examples
========

Look at the structure of the data in the netCDF dataset foo.nc:

   ncdump -c foo.nc

   Produce an annotated CDL version of the structure and data in the
netCDF dataset foo.nc, using C-style indexing for the annotations:

   ncdump -b c foo.nc > foo.cdl

   Output data for only the variables uwind and vwind from the netCDF
dataset foo.nc, and show the floating-point data with only three
significant digits of precision:

   ncdump -v uwind,vwind -p 3 foo.nc

   Produce a fully-annotated (one data value per line) listing of the
data for the variable omega, using FORTRAN conventions for indices, and
changing the netCDF dataset name in the resulting CDL file to omega:

   ncdump -v omega -f fortran -n omega foo.nc > Z.cdl

Appendix A Units
****************

The Unidata Program Center has developed a units library to convert
between formatted and binary forms of units specifications and perform
unit algebra on the binary form. Though the units library is
self-contained and there is no dependency between it and the netCDF
library, it is nevertheless useful in writing generic netCDF programs
and we suggest you obtain it. The library and associated documentation
is available from `http://www.unidata.ucar.edu/software/udunits/'.

   The following are examples of units strings that can be interpreted
by the utScan() function of the Unidata units library:

     10 kilogram.meters/seconds2
     10 kg-m/sec2
     10 kg m/s^2
     10 kilogram meter second-2
     (PI radian)2
     degF
     100rpm
     geopotential meters
     33 feet water
     milliseconds since 1992-12-31 12:34:0.1 -7:00

   A unit is specified as an arbitrary product of constants and
unit-names raised to arbitrary integral powers. Division is indicated
by a slash '/'. Multiplication is indicated by white space, a period
'.', or a hyphen '-'. Exponentiation is indicated by an integer suffix
or by the exponentiation operators '^' and '**'. Parentheses may be
used for grouping and disambiguation. The time stamp in the last
example is handled as a special case.

   Arbitrary Galilean transformations (i.e., y = ax + b) are allowed. In
particular, temperature conversions are correctly handled. The
specification:

     degF  32

   indicates a Fahrenheit scale with the origin shifted to thirty-two
degrees Fahrenheit (i.e., to zero Celsius). Thus, the Celsius scale is
equivalent to the following unit:

     1.8 degF  32

   Note that the origin-shift operation takes precedence over
multiplication. In order of increasing precedence, the operations are
division, multiplication, origin-shift, and exponentiation.

   utScan() understands all the SI prefixes (e.g. "mega" and "milli")
plus their abbreviations (e.g. "M" and "m")

   The function utPrint() always encodes a unit specification one way.
To reduce misunderstandings, it is recommended that this encoding style
be used as the default. In general, a unit is encoded in terms of basic
units, factors, and exponents. Basic units are separated by spaces, and
any exponent directly appends its associated unit. The above examples
would be encoded as follows:

     10 kilogram meter second-2
     9.8696044 radian2
     0.555556 kelvin  255.372
     10.471976 radian second-1
     9.80665 meter2 second-2
     98636.5 kilogram meter-1 second-2
     0.001 seconds since 1992-12-31 19:34:0.1000 UTC

   (Note that the Fahrenheit unit is encoded as a deviation, in
fractional kelvins, from an origin at 255.372 kelvin, and that the time
in the last example has been referenced to UTC.)

   The database for the units library is a formatted file containing
unit definitions and is used to initialize this package. It is the first
place to look to discover the set of valid names and symbols.

   The format for the units-file is documented internally and the file
may be modified by the user as necessary. In particular, additional
units and constants may be easily added (including variant spellings of
existing units or constants).

   utScan() is case-sensitive. If this causes difficulties, you might
try making appropriate additional entries to the units-file.

   Some unit abbreviations in the default units-file might seem
counterintuitive. In particular, note the following:

For                Use                Not                Which Instead
                                                         Means
Celsius            Celsius            C                  coulomb
gram               gram               g                  <standard free
                                                         fall>
gallon             gallon             gal                <acceleration>
radian             radian             rad                <absorbed dose>
Newton             newton or N        nt                 nit (unit of
                                                         photometry)

   For additional information on this units library, please consult the
manual pages that come with its distribution.

Appendix B Attribute Conventions
********************************

Names commencing with underscore ('_') are reserved for use by the
netCDF library. Most generic applications that process netCDF datasets
assume standard attribute conventions and it is strongly recommended
that these be followed unless there are good reasons for not doing so.
Below we list the names and meanings of recommended standard attributes
that have proven useful. Note that some of these (e.g. units,
valid_range, scale_factor) assume numeric data and should not be used
with character data.

`units'
     A character string that specifies the units used for the variable's
     data. Unidata has developed a freely-available library of routines
     to convert between character string and binary forms of unit
     specifications and to perform various useful operations on the
     binary forms. This library is used in some netCDF applications.
     Using the recommended units syntax permits data represented in
     conformable units to be automatically converted to common units
     for arithmetic operations. For more information see *Note Units::.

`long_name'
     A long descriptive name. This could be used for labeling plots, for
     example. If a variable has no long_name attribute assigned, the
     variable name should be used as a default.

`valid_min'
     A scalar specifying the minimum valid value for this variable.

`valid_max'
     A scalar specifying the maximum valid value for this variable.

`valid_range'
     A vector of two numbers specifying the minimum and maximum valid
     values for this variable, equivalent to specifying values for both
     valid_min and valid_max attributes. Any of these attributes define
     the valid range. The attribute valid_range must not be defined if
     either valid_min or valid_max is defined.

     Generic applications should treat values outside the valid range as
     missing. The type of each valid_range, valid_min and valid_max
     attribute should match the type of its variable (except that for
     byte data, these can be of a signed integral type to specify the
     intended range).

     If neither valid_min, valid_max nor valid_range is defined then
     generic applications should define a valid range as follows. If the
     data type is byte and _FillValue is not explicitly defined, then
     the valid range should include all possible values. Otherwise, the
     valid range should exclude the _FillValue (whether defined
     explicitly or by default) as follows. If the _FillValue is
     positive then it defines a valid maximum, otherwise it defines a
     valid minimum. For integer types, there should be a difference of
     1 between the _FillValue and this valid minimum or maximum. For
     floating point types, the difference should be twice the minimum
     possible (1 in the least significant bit) to allow for rounding
     error.

`scale_factor'
     If present for a variable, the data are to be multiplied by this
     factor after the data are read by the application that accesses the
     data.

`add_offset'
     If present for a variable, this number is to be added to the data
     after it is read by the application that accesses the data. If both
     scale_factor and add_offset attributes are present, the data are
     first scaled before the offset is added. The attributes
     scale_factor and add_offset can be used together to provide simple
     data compression to store low-resolution floating-point data as
     small integers in a netCDF dataset. When scaled data are written,
     the application should first subtract the offset and then divide
     by the scale factor, rounding the result to the nearest integer to
     avoid a bias caused by truncation towards zero.

     When scale_factor and add_offset are used for packing, the
     associated variable (containing the packed data) is typically of
     type byte or short, whereas the unpacked values are intended to be
     of type float or double. The attributes scale_factor and
     add_offset should both be of the type intended for the unpacked
     data, e.g. float or double.

`_FillValue'
     The _FillValue attribute specifies the fill value used to pre-fill
     disk space allocated to the variable. Such pre-fill occurs unless
     nofill mode is set using nc_set_fill in C (*note nc_set_fill:
     (netcdf-c)nc_set_fill.) or NF_SET_FILL in Fortran (*note
     NF_SET_FILL: (netcdf-f77)NF_SET_FILL.). The fill value is returned
     when reading values that were never written. If _FillValue is
     defined then it should be scalar and of the same type as the
     variable. It is not necessary to define your own _FillValue
     attribute for a variable if the default fill value for the type of
     the variable is adequate. However, use of the default fill value
     for data type byte is not recommended. Note that if you change the
     value of this attribute, the changed value applies only to
     subsequent writes; previously written data are not changed.

     Generic applications often need to write a value to represent
     undefined or missing values. The fill value provides an appropriate
     value for this purpose because it is normally outside the valid
     range and therefore treated as missing when read by generic
     applications. It is legal (but not recommended) for the fill value
     to be within the valid range.

     For more information for C programmers see *Note Fill Values:
     (netcdf-c)Fill Values. For more information for Fortran
     programmers see *Note Fill Values: (netcdf-f77)Fill Values.

`missing_value'
     This attribute is not treated in any special way by the library or
     conforming generic applications, but is often useful documentation
     and may be used by specific applications. The missing_value
     attribute can be a scalar or vector containing values indicating
     missing data. These values should all be outside the valid range
     so that generic applications will treat them as missing.

`signedness'
     Deprecated attribute, originally designed to indicate whether byte
     values should be treated as signed or unsigned. The attributes
     valid_min and valid_max may be used for this purpose. For example,
     if you intend that a byte variable store only nonnegative values,
     you can use valid_min = 0 and valid_max = 255. This attribute is
     ignored by the netCDF library.

`C_format'
     A character array providing the format that should be used by C
     applications to print values for this variable. For example, if you
     know a variable is only accurate to three significant digits, it
     would be appropriate to define the C_format attribute as "%.3g".
     The ncdump utility program uses this attribute for variables for
     which it is defined. The format applies to the scaled (internal)
     type and value, regardless of the presence of the scaling
     attributes scale_factor and add_offset.

`FORTRAN_format'
     A character array providing the format that should be used by
     FORTRAN applications to print values for this variable. For
     example, if you know a variable is only accurate to three
     significant digits, it would be appropriate to define the
     FORTRAN_format attribute as "(G10.3)".

`title'
     A global attribute that is a character array providing a succinct
     description of what is in the dataset.

`history'
     A global attribute for an audit trail. This is a character array
     with a line for each invocation of a program that has modified the
     dataset. Well-behaved generic netCDF applications should append a
     line containing: date, time of day, user name, program name and
     command arguments.

`Conventions'
     If present, 'Conventions' is a global attribute that is a character
     array for the name of the conventions followed by the dataset, in
     the form of a string that is interpreted as a directory name
     relative to a directory that is a repository of documents
     describing sets of discipline-specific conventions. This permits a
     hierarchical structure for conventions and provides a place where
     descriptions and examples of the conventions may be maintained by
     the defining institutions and groups. The conventions directory
     name is currently interpreted relative to the directory
     pub/netcdf/Conventions/ on the host machine ftp.unidata.ucar.edu.
     Alternatively, a full URL specification may be used to name a WWW
     site where documents that describe the conventions are maintained.

     For example, if a group named NUWG agrees upon a set of conventions
     for dimension names, variable names, required attributes, and
     netCDF representations for certain discipline-specific data
     structures, they may store a document describing the agreed-upon
     conventions in a dataset in the NUWG/ subdirectory of the
     Conventions directory. Datasets that followed these conventions
     would contain a global Conventions attribute with value "NUWG".

     Later, if the group agrees upon some additional conventions for a
     specific subset of NUWG data, for example time series data, the
     description of the additional conventions might be stored in the
     NUWG/Time_series/ subdirectory, and datasets that adhered to these
     additional conventions would use the global Conventions attribute
     with value "NUWG/Time_series", implying that this dataset adheres
     to the NUWG conventions and also to the additional NUWG time-series
     conventions.

Appendix C File Format Specification
************************************

In different contexts, "netCDF" may refer to an abstract data model, a
software implementation with associated application program interfaces
(APIs), or a data format. Confusion may easily arise in discussions of
different versions of the data models, software, and formats, because
the relationships among versions of these entities is more complex than
a simple one-to-one correspondence by version. For example,
compatibility commitments require that new versions of the software
support all previous variants of the format and data model.

   To avoid this potential confusion, we assign distinct names to
versions of the formats, data models, and software releases that will
be used consistently in the remainder of this appendix.

   In this appendix, two format variants are specified formally, the
"classic format" and the "64-bit offset format" for netCDF data. Two
additional format variants are discussed less formally, the "netCDF-4
format" and the "netCDF-4 classic model format".

   The classic format was the only format for netCDF data created
between 1989 and 2004 by various versions of the reference software from
Unidata. In 2004, the 64-bit offset format variant was introduced for
creation of and access to much larger files. The reference software,
available for C-based and Java-based programs, supported use of the
same APIs for accessing either classic or 64-bit offset files, so
programs reading the files would not have to depend on which format was
used.

   There are only two netCDF data models, the "classic model" and the
"enhanced model". The classic model is the simpler of the two, and is
used for all data stored in classic format, 64-bit offset format, or
netCDF-4 classic model format. The enhanced model (also referred to as
the netCDF- 4 data model) was introduced in 2008 as an extension of the
classic model that adds more powerful forms of data representation and
data types at the expense of some additional complexity. Although data
represented with the classic model can also be represented using the
enhanced model, datasets that use features of the enhanced model, such
as user-defined nested data types, cannot be represented with the
classic model. Use of added features of the enhanced model requires
that data be stored in the netCDF-4 format.

   Versions 1.0 through 3.5 of the Unidata C-based reference software,
released between 1989 and 2000, supported only the classic data model
and classic format. Version 3.6, released in late 2004, first provided
support for the 64-bit offset format, but still used the classic data
model.  With version 4.0, released in 2008, the enhanced data model was
introduced along with the two new HDF5-based format variants, the
netCDF-4 format and the netCDF-4 classic model format.  Evolution of
the data models, formats, and APIs will continue the commitment to
support all previous netCDF data models, data format variants, and APIs
in future software releases.

   Use of the HDF5 storage layer in netCDF-4 software provides features
for improved performance, independent of the data model used, for
example compression and dynamic schema changes. Such performance
improvements are available for data stored in the netCDF-4 classic
model format, even when accessed by programs that only support the
classic model.

   A full specification of the two netCDF-4 formats in terms of the
underlying HDF5 storage layer is outside the scope of this appendix.
Other related formats not discussed in this appendix include CDL
("Common Data Language", the original ASCII form of binary netCDF
data), and NcML (NetCDF Markup Language, an XML-based representation
for netCDF metadata and data).

   Knowledge of format details is not required to read or write netCDF
datasets. Software that reads netCDF data using the reference
implementation automatically detects and uses the correct version of
the format for accessing data. Understanding details may be helpful for
understanding performance issues related to disk or server access.

   The netCDF reference library, developed and supported by Unidata, is
written in C, with Fortran77, Fortran90, and C++ interfaces. A number
of community and commercially supported interfaces to other languages
are also available, including IDL, Matlab, Perl, Python, and Ruby.  An
independent implementation, also developed and supported by Unidata, is
written entirely in Java.

C.1 The NetCDF-4 Format
=======================

The netCDF-4 format implements and expands the netCDF-3 data model by
using an enhanced version of HDF5 as the storage layer.  Use is made of
features that are only available in HDF5 version 1.8 and later.

   Using HDF5 as the underlying storage layer, netCDF-4 files remove
many of the restrictions for classic and 64-bit offset files.  The
richer enhanced model supports user-defined types and data structures,
hierarchical scoping of names using groups, more primitive types
including strings, larger variable sizes, and multiple unlimited
dimensions.  The underlying HDF5 storage layer also supports
per-variable compression, multidimensional tiling, and efficient
dynamic schema changes, so that data need not be copied when adding new
variables to the file schema.

   Although every file in netCDF-4 format is an HDF5 file, there are
HDF5 files that are not netCDF-4 format files, because the netCDF-4
format intentionally uses a limited subset of the HDF5 data model and
file format features.  Some HDF5 features not supported in the netCDF
enhanced model and netCDF-4 format include non-hierarchical group
structures, HDF5 reference types, multiple links to a data object,
user-defined atomic data types, stored property lists, more permissive
rules for data object names, the HDF5 date/time type, and attributes
associated with user-defined types.

   A complete specification of HDF5 files is beyond the scope of this
document.  For more information about HDF5, see the HDF5 web site:
`http://hdf.ncsa.uiuc.edu/HDF5/'.

C.2 The NetCDF-4 Classic Model Format
=====================================

Every classic and 64-bit offset file can be represented as a netCDF-4
file, with no loss of information.  There are some significant benefits
to using the simpler netCDF classic model with the netCDF-4 file
format.  For example, software that writes or reads classic model data
can write or read netCDF-4 classic model format data by
recompiling/relinking to a netCDF-4 API library, with no or only
trivial changes needed to the program source code.  The netCDF-4
classic model format supports this usage by enforcing rules on what
functions may be called to store data in the file, to make sure its
data can be read by older netCDF applications (when relinked to a
netCDF-4 library).

   Writing data in this format prevents use of enhanced model features
such as groups, added primitive types not available in the classic
model, and user-defined types.  However performance features of the
netCDF-4 formats that do not require additional features of the
enhanced model, such as per-variable compression and chunking,
efficient dynamic schema changes, and larger variable size limits,
offer potentially significant performance improvements to readers of
data stored in this format, without requiring program changes.

C.3 The NetCDF Classic Format
=============================

To present the format more formally, we use a BNF grammar notation. In
this notation:

   * Non-terminals (entities defined by grammar rules) are in lower
     case.

   * Terminals (atomic entities in terms of which the format
     specification is written) are in upper case, and are specified
     literally as US-ASCII characters within single-quote characters or
     are described with text between angle brackets (`<' and `>').

   * Optional entities are enclosed between braces (`[' and `]').

   * A sequence of zero or more occurrences of an entity is denoted by
     `[entity ...]'.

   * A vertical line character (`|') separates alternatives. Alternation
     has lower precedence  than concatenation.

   * Comments follow `//' characters.

   * A single byte that is not a printable character is denoted using a
     hexadecimal number with the notation `\xDD', where each D is a
     hexadecimal digit.

   * A literal single-quote character is denoted by `\'', and a literal
     back-slash character is denoted by `\\'.

   Following the grammar, a few additional notes are included to specify
format characteristics that are impractical to capture in a BNF
grammar, and to note some special cases for implementers.  Comments in
the grammar point to the notes and special cases, and help to clarify
the intent of elements of the format.

The Format in Detail
--------------------

     netcdf_file  = header  data
     header       = magic  numrecs  dim_list  gatt_list  var_list
     magic        = 'C'  'D'  'F'  VERSION
     VERSION      = \x01 |                      // classic format
                    \x02                        // 64-bit offset format
     numrecs      = NON_NEG | STREAMING         // length of record dimension
     dim_list     = ABSENT | NC_DIMENSION  nelems  [dim ...]
     gatt_list    = att_list                    // global attributes
     att_list     = ABSENT | NC_ATTRIBUTE  nelems  [attr ...]
     var_list     = ABSENT | NC_VARIABLE   nelems  [var ...]
     ABSENT       = ZERO  ZERO                  // Means list is not present
     ZERO         = \x00 \x00 \x00 \x00         // 32-bit zero
     NC_DIMENSION = \x00 \x00 \x00 \x0A         // tag for list of dimensions
     NC_VARIABLE  = \x00 \x00 \x00 \x0B         // tag for list of variables
     NC_ATTRIBUTE = \x00 \x00 \x00 \x0C         // tag for list of attributes
     nelems       = NON_NEG       // number of elements in following sequence
     dim          = name  dim_length
     name         = nelems  namestring
                           // Names a dimension, variable, or attribute.
                           // Names should match the regular expression
                           // ([a-zA-Z_]|{MUTF8})([^\x00-\x1F/\x7F-\xFF]|{MUTF8})*
                           // For other constraints, see "Note on names",  below.
     namestring   = ID1 [IDN ...]
     ID1          = alpha | '_'
     IDN          = alpha | numeric | special1 | special2
     alpha        = lowercase | uppercase | MUTF8
     lowercase    = 'a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|
                    'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'
     uppercase    = 'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|
                    'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'
     numeric      = '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
                                  // special1 chars have traditionally been
                                  // permitted in netCDF names.
     special1     = '_'|'.'|'@'|'+'|'-'
                                  // special2 chars are recently permitted in
                                  // names (and require escaping in CDL).
     special2     = ' ' | '!' | '"' | '#' | '$' | '%' | '&' | '\'' |
                    '(' | ')' | '*' | ',' | ':' | ';' | '<' | '=' |
                    '>' | '?' | '[' | '\\' | ']' | '^' | '`' | '{' |
                    '|' | '}' | '~'            // Note: '/' is not permitted
     MUTF8        = <multibyte UTF-8 encoded, NFC-normalized Unicode character>
     dim_length   = NON_NEG       // If zero, this is the record dimension.
                                  // There can be at most one record dimension.
     attr         = name  nc_type  nelems  [values ...]
     nc_type      = NC_BYTE   |
                    NC_CHAR   |
                    NC_SHORT  |
                    NC_INT    |
                    NC_FLOAT  |
                    NC_DOUBLE
     var          = name  nelems  [dimid ...]  vatt_list  nc_type  vsize  begin
                                  // nelems is the dimensionality (rank) of the
                                  // variable: 0 for scalar, 1 for vector, 2
                                  // for matrix, ...
     dimid        = NON_NEG       // Dimension ID (index into dim_list) for
                                  // variable shape.  We say this is a "record
                                  // variable" if and only if the first
                                  // dimension is the record dimension.
     vatt_list    = att_list      // Variable-specific attributes
     vsize        = NON_NEG       // Variable size.  If not a record variable,
                                  // the amount of space in bytes allocated to
                                  // the variable's data.  If a record variable,
                                  // the amount of space per record.  See "Note
                                  // on vsize", below.
     begin        = OFFSET        // Variable start location.  The offset in
                                  // bytes (seek index) in the file of the
                                  // beginning of data for this variable.
     data         = non_recs  recs
     non_recs     = [vardata ...] // The data for all non-record variables,
                                  // stored contiguously for each variable, in
                                  // the same order the variables occur in the
                                  // header.
     vardata      = [values ...]  // All data for a non-record variable, as a
                                  // block of values of the same type as the
                                  // variable, in row-major order (last
                                  // dimension varying fastest).
     recs         = [record ...]  // The data for all record variables are
                                  // stored interleaved at the end of the
                                  // file.
     record       = [varslab ...] // Each record consists of the n-th slab
                                  // from each record variable, for example
                                  // x[n,...], y[n,...], z[n,...] where the
                                  // first index is the record number, which
                                  // is the unlimited dimension index.
     varslab      = [values ...]  // One record of data for a variable, a
                                  // block of values all of the same type as
                                  // the variable in row-major order (last
                                  // index varying fastest).
     values       = bytes | chars | shorts | ints | floats | doubles
     string       = nelems  [chars]
     bytes        = [BYTE ...]  padding
     chars        = [CHAR ...]  padding
     shorts       = [SHORT ...]  padding
     ints         = [INT ...]
     floats       = [FLOAT ...]
     doubles      = [DOUBLE ...]
     padding      = <0, 1, 2, or 3 bytes to next 4-byte boundary>
                                  // Header padding uses null (\x00) bytes.  In
                                  // data, padding uses variable's fill value.
                                  // See "Note on padding", below, for a special
                                  // case.
     NON_NEG      = <non-negative INT>
     STREAMING    = \xFF \xFF \xFF \xFF   // indicates indeterminate record
                                          // count, allows streaming data
     OFFSET       = <non-negative INT> |  // for classic format or
                    <non-negative INT64>  // for 64-bit offset format
     BYTE         = <8-bit byte>          // See "Note on byte data", below.
     CHAR         = <8-bit byte>          // See "note on char data", below.
     SHORT        = <16-bit signed integer, Bigendian, two's complement>
     INT          = <32-bit signed integer, Bigendian, two's complement>
     INT64        = <64-bit signed integer, Bigendian, two's complement>
     FLOAT        = <32-bit IEEE single-precision float, Bigendian>
     DOUBLE       = <64-bit IEEE double-precision float, Bigendian>
                                  // following type tags are 32-bit integers
     NC_BYTE      = \x00 \x00 \x00 \x01       // 8-bit signed integers
     NC_CHAR      = \x00 \x00 \x00 \x02       // text characters
     NC_SHORT     = \x00 \x00 \x00 \x03       // 16-bit signed integers
     NC_INT       = \x00 \x00 \x00 \x04       // 32-bit signed integers
     NC_FLOAT     = \x00 \x00 \x00 \x05       // IEEE single precision floats
     NC_DOUBLE    = \x00 \x00 \x00 \x06       // IEEE double precision floats
                                  // Default fill values for each type, may be
                                  // overridden by variable attribute named
                                  // `_FillValue'. See "Note on fill values",
                                  // below.
     FILL_CHAR    = \x00                      // null byte
     FILL_BYTE    = \x81                      // (signed char) -127
     FILL_SHORT   = \x80 \x01                 // (short) -32767
     FILL_INT     = \x80 \x00 \x00 \x01       // (int) -2147483647
     FILL_FLOAT   = \x7C \xF0 \x00 \x00       // (float) 9.9692099683868690e+36
     FILL_DOUBLE  = \x47 \x9E \x00 \x00 \x00 \x00 //(double)9.9692099683868690e+36

   Note on vsize: This number is the product of the dimension lengths
(omitting the record dimension) and the number of bytes per value
(determined from the type), increased to the next multiple of 4, for
each variable.  If a record variable, this is the amount of space per
record.  The netCDF "record size" is calculated as the sum of the
vsize's of all the record variables.

   The vsize field is actually redundant, because its value may be
computed from other information in the header. The 32-bit vsize field
is not large enough to contain the size of variables that require more
than 2^32 - 4 bytes, so 2^32 - 1 is used in the vsize field for such
variables.

   Note on names: Earlier versions of the netCDF C-library reference
implementation enforced a more restricted set of characters in creating
new names, but permitted reading names containing arbitrary bytes.
This specification extends the permitted characters in names to include
multibyte UTF-8 encoded Unicode and additional printing characters from
the US-ASCII alphabet. The first character of a name must be
alphabetic, a multibyte UTF-8 character, or '_' (traditionally reserved
for names with meaning to implementations, such as the "_FillValue"
attribute).  Subsequent characters may also include numerals and
printing special characters, except for '/' which is not allowed in
names.  Names that have trailing space characters are also not
permitted.

   Implementations of the netCDF classic and 64-bit offset format must
ensure that names are normalized according to Unicode NFC normalization
rules during encoding as UTF-8 for storing in the file header.  This is
necessary to ensure that gratuitous differences in the representation
of Unicode names do not cause anomalies in comparing files and querying
data objects by name.

   Note on streaming data: The largest possible record count, 2^32 - 1,
is reserved to indicate an indeterminate number of records.  This means
that the number of records in the file must be determined by other
means, such as reading them or computing the current number of records
from the file length and other information in the header.  It also
means that the numrecs field in the header will not be updated as
records are added to the file.  [This feature is not yet implemented].

   Note on padding:  In the special case of only a single record
variable of character, byte, or short type, no padding is used between
data values.

   Note on byte data: It is possible to interpret byte data as either
signed (-128 to 127) or unsigned (0 to 255). When reading byte data
through an interface that converts it into another numeric type, the
default interpretation is signed.  There are various attribute
conventions for specifying whether bytes represent signed or unsigned
data, but no standard convention has been established.  The variable
attribute "_Unsigned" is reserved for this purpose in future
implementations.

   Note on char data: Although the characters used in netCDF names must
be encoded as UTF-8, character data may use other encodings. The
variable attribute "_Encoding" is reserved for this purpose in future
implementations.

   Note on fill values: Because data variables may be created before
their values are written, and because values need not be written
sequentially in a netCDF file, default "fill values" are defined for
each type, for initializing data values before they are explicitly
written.  This makes it possible to detect reading values that were
never written.  The variable attribute "_FillValue", if present,
overrides the default fill value for a variable. If _FillValue is
defined then it should be scalar and of the same type as the variable.

   Fill values are not required, however, because netCDF libraries have
traditionally supported a "no fill" mode when writing, omitting the
initialization of variable values with fill values. This makes the
creation of large files faster, but also eliminates the possibility of
detecting the inadvertent reading of values that haven't been written.

Notes on Computing File Offsets
-------------------------------

The offset (position within the file) of a specified data value in a
classic format or 64-bit offset data file is completely determined by
the variable start location (the offset in the `begin' field), the
external type of the variable (the `nc_type' field), and the dimension
indices (one for each of the variable's dimensions) of the value
desired.

   The external size in bytes of one data value for each possible
netCDF type, denoted `extsize' below, is:

   NC_BYTE         1 NC_CHAR         1 NC_SHORT        2 NC_INT
4 NC_FLOAT        4 NC_DOUBLE       8

   The record size, denoted by `recsize' below, is the sum of the
`vsize' fields of record variables (variables that use the unlimited
dimension), using the actual value determined by dimension sizes and
variable type in case the `vsize' field is too small for the variable
size.

   To compute the offset of a value relative to the beginning of a
variable, it is helpful to precompute a "product vector" from the
dimension lengths.  Form the products of the dimension lengths for the
variable from right to left, skipping the leftmost (record) dimension
for record variables, and storing the results as the product vector for
each variable.

   For example:

   Non-record variable:

   dimension lengths:      [  5  3  2 7]         product vector:
[210 42 14 7]

   Record variable:

   dimension lengths:      [0  2  9 4]         product vector:
[0 72 36 4]

   At this point, the leftmost product, when rounded up to the next
multiple of 4, is the variable size, `vsize', in the grammar above. For
example, in the non-record variable above, the value of the `vsize'
field is 212 (210 rounded up to a multiple of 4). For the record
variable, the value of `vsize' is just 72, since this is already a
multiple of 4.

   Let coord be the array of coordinates (dimension indices, zero-based)
of the desired data value.  Then the offset of the value from the
beginning of the file is just the file offset of the first data value
of the desired variable (its `begin' field) added to the inner product
of the coord and product vectors times the size, in bytes, of each
datum for the variable. Finally, if the variable is a record variable,
the product of the record number, 'coord[0]', and the record size,
`recsize', is added to yield the final offset value.

   A special case: Where there is exactly one record variable, we drop
the requirement that each record be four-byte aligned, so in this case
there is no record padding.

Examples
--------

By using the grammar above, we can derive the smallest valid netCDF
file, having no dimensions, no variables, no attributes, and hence, no
data. A CDL representation of the empty netCDF file is

   netcdf empty { }

   This empty netCDF file has 32 bytes. It begins with the four-byte
"magic number" that identifies it as a netCDF version 1 file: `C', `D',
`F', `\x01'. Following are seven 32-bit integer zeros representing the
number of records, an empty list of dimensions, an empty list of global
attributes, and an empty list of variables.

   Below is an (edited) dump of the file produced using the Unix command

   od -xcs empty.nc

   Each 16-byte portion of the file is displayed with 4 lines. The first
line displays the bytes in hexadecimal. The second line displays the
bytes as characters. The third line displays each group of two bytes
interpreted as a signed 16-bit integer. The fourth line (added by
human) presents the interpretation of the bytes in terms of netCDF
components and values.

        4344    4601    0000    0000    0000    0000    0000    0000
       C   D   F 001  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
       17220   17921   00000   00000   00000   00000   00000   00000
     [magic number ] [  0 records  ] [  0 dimensions   (ABSENT)    ]

        0000    0000    0000    0000    0000    0000    0000    0000
      \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
       00000   00000   00000   00000   00000   00000   00000   00000
     [  0 global atts  (ABSENT)    ] [  0 variables    (ABSENT)    ]

   As a less trivial example, consider the CDL

     netcdf tiny {
     dimensions:
             dim = 5;
     variables:
             short vx(dim);
     data:
             vx = 3, 1, 4, 1, 5 ;
     }

   which corresponds to a 92-byte netCDF file. The following is an
edited dump of this file:

        4344    4601    0000    0000    0000    000a    0000    0001
       C   D   F 001  \0  \0  \0  \0  \0  \0  \0  \n  \0  \0  \0 001
       17220   17921   00000   00000   00000   00010   00000   00001
     [magic number ] [  0 records  ] [NC_DIMENSION ] [ 1 dimension ]

        0000    0003    6469    6d00    0000    0005    0000    0000
      \0  \0  \0 003   d   i   m  \0  \0  \0  \0 005  \0  \0  \0  \0
       00000   00003   25705   27904   00000   00005   00000   00000
     [  3 char name = "dim"        ] [ size = 5    ] [ 0 global atts

        0000    0000    0000    000b    0000    0001    0000    0002
      \0  \0  \0  \0  \0  \0  \0 013  \0  \0  \0 001  \0  \0  \0 002
       00000   00000   00000   00011   00000   00001   00000   00002
      (ABSENT)     ] [NC_VARIABLE  ] [ 1 variable  ] [ 2 char name =

        7678    0000    0000    0001    0000    0000    0000    0000
       v   x  \0  \0  \0  \0  \0 001  \0  \0  \0  \0  \0  \0  \0  \0
       30328   00000   00000   00001   00000   00000   00000   00000
      "vx"         ] [1 dimension  ] [ with ID 0   ] [ 0 attributes

        0000    0000    0000    0003    0000    000c    0000    0050
      \0  \0  \0  \0  \0  \0  \0 003  \0  \0  \0  \f  \0  \0  \0   P
       00000   00000   00000   00003   00000   00012   00000   00080
      (ABSENT)     ] [type NC_SHORT] [size 12 bytes] [offset:    80]

        0003    0001    0004    0001    0005    8001
      \0 003  \0 001  \0 004  \0 001  \0 005 200 001
       00003   00001   00004   00001   00005  -32767
     [    3] [    1] [    4] [    1] [    5] [fill ]

C.4 The 64-bit Offset Format
============================

The netCDF 64-bit offset format differs from the classic format only in
the VERSION byte, `\x02' instead of `\x01', and the OFFSET entity, a
64-bit instead of a 32-bit offset from the beginning of the file.  This
small format change permits much larger files, but there are still some
practical size restrictions.  Each fixed-size variable and the data for
one record's worth of each record variable are still limited in size to
a little less that 4 GiB.  The rationale for this limitation is to
permit aggregate access to all the data in a netCDF variable (or a
record's worth of data) on 32-bit platforms.

Index
*****

64-bit offset file format:                     See 4.1.      (line 1644)
64-bit offset format, introduction:            See 4.3.      (line 1750)
64-bit offset format, limitations:             See 4.4.      (line 1800)
64-bit offsets, history:                       See 1.8.      (line  409)
_FillValue:                                    See Appendix B.
                                                             (line 2724)
_IONBF flag:                                   See 4.6.      (line 1894)
access C example of array section:             See 3.2.2.    (line 1292)
access Fortran example of array section:       See 3.2.4.    (line 1416)
access random:                                 See 3.2.      (line 1208)
access shared dataset I/O:                     See 4.6.      (line 1894)
ADA API, history:                              See 1.8.      (line  409)
add_offset:                                    See Appendix B.
                                                             (line 2705)
ancillary data as attributes:                  See 2.5.      (line 1100)
ancillary data, storing:                       See 2.4.      (line 1022)
API, C <1>:
          See ``Summary''.                                   (line  122)
API, C:
          See ``NetCDF Users Guide''.                        (line    4)
API, C++ <1>:
          See ``Summary''.                                   (line  122)
API, C++:
          See ``NetCDF Users Guide''.                        (line    4)
API, F90:
          See ``Summary''.                                   (line  122)
API, Fortran:
          See ``Summary''.                                   (line  122)
API, Fortran 77:
          See ``NetCDF Users Guide''.                        (line    4)
API, Fortran 90:
          See ``NetCDF Users Guide''.                        (line    4)
API, Java:
          See ``Summary''.                                   (line  122)
appending data along unlimited dimension:      See 2.2.      (line  876)
applications, generic:                         See 2.4.      (line 1022)
applications, generic, conventions <1>:        See Appendix B.
                                                             (line 2644)
applications, generic, conventions:            See 1.7.      (line  384)
applications, generic, reasons for netCDF:     See 5.        (line 1994)
applications, generic, units:                  See Appendix A.
                                                             (line 2545)
archive format:                                See 1.6.      (line  372)
Argonne National Laboratory:                   See 1.8.      (line  409)
array section, C example:                      See 3.2.2.    (line 1292)
array section, corner:                         See 3.2.1.    (line 1237)
array section, definition:                     See 3.2.1.    (line 1237)
array section, edges:                          See 3.2.1.    (line 1237)
array section, Fortran example:                See 3.2.4.    (line 1416)
array section, mapped:                         See 3.2.1.    (line 1237)
arrays, ragged:                                See 1.10.     (line  597)
ASCII characters:                              See 3.1.      (line 1137)
attribute conventions:                         See Appendix B.
                                                             (line 2644)
attributes associated with a variable:         See 2.3.      (line  931)
attributes vs. variables:                      See 2.5.      (line 1100)
attributes, adding to existing dataset:        See 2.4.      (line 1022)
attributes, CDL, defining:                     See 5.1.      (line 2026)
attributes, CDL, global:                       See 5.1.      (line 2026)
attributes, CDL, initializing:                 See 5.3.      (line 2209)
attributes, data type:                         See 2.4.      (line 1022)
attributes, data types, CDL:                   See 5.3.      (line 2209)
attributes, defined:                           See 2.4.      (line 1022)
attributes, defining in CDL:                   See 2.4.      (line 1022)
attributes, global:                            See 2.4.      (line 1022)
attributes, length, CDL:                       See 5.3.      (line 2209)
attributes, operations on:                     See 2.4.      (line 1022)
buffers, I/O:                                  See 4.6.      (line 1894)
byte:                                          See 5.2.      (line 2137)
byte array vs. text string:                    See 3.3.      (line 1517)
byte CDL constant:                             See 5.3.      (line 2209)
byte, CDL data type:                           See 5.2.      (line 2132)
byte, signed vs. unsigned:                     See 3.1.      (line 1137)
C API <1>:
          See ``Summary''.                                   (line  122)
C API:
          See ``NetCDF Users Guide''.                        (line    4)
C code via ncgen, generating:                  See 5.4.      (line 2291)
C++ API <1>:
          See ``Summary''.                                   (line  122)
C++ API:
          See ``NetCDF Users Guide''.                        (line    4)
C_format:                                      See Appendix B.
                                                             (line 2766)
CANDIS:                                        See 1.8.      (line  409)
CDF1:                                          See 4.3.      (line 1750)
CDF2:                                          See 4.3.      (line 1750)
CDL attributes, defining:                      See 5.1.      (line 2026)
CDL constants:                                 See 5.3.      (line 2209)
CDL data types:                                See 5.2.      (line 2132)
CDL dimensions, defining:                      See 5.1.      (line 2026)
CDL syntax:                                    See 5.1.      (line 2026)
CDL variables, defining:                       See 5.1.      (line 2026)
CDL, defining attributes:                      See 2.4.      (line 1022)
CDL, defining global attributes:               See 2.4.      (line 1022)
CDL, example:                                  See 2.1.      (line  764)
char:                                          See 5.2.      (line 2134)
char, CDL data type:                           See 5.2.      (line 2132)
classic file format:                           See 4.1.      (line 1644)
classic format, introduction:                  See 4.3.      (line 1750)
classic format, limitations:                   See 4.5.      (line 1831)
classic netCDF format:                         See 1.10.     (line  597)
common data form language:                     See 2.1.      (line  764)
compression:                                   See 1.6.      (line  372)
Conventions:                                   See Appendix B.
                                                             (line 2794)
conventions, attributes:                       See Appendix B.
                                                             (line 2644)
conventions, introduction:                     See 1.7.      (line  384)
conventions, naming:                           See 2.1.      (line  764)
conversion of data types, introduction:        See 3.1.      (line 1137)
coordinate variables:                          See 2.3.      (line  931)
data base:                                     See 1.2.      (line  221)
data model, netCDF:                            See 2.1.      (line  764)
data structures:                               See 3.4.      (line 1578)
data types, conversion:                        See 3.3.      (line 1517)
data types, external:                          See 3.1.      (line 1137)
data, reading:                                 See 3.2.      (line 1208)
data, writing:                                 See 3.2.      (line 1208)
DBMS:                                          See 1.2.      (line  221)
differences between attributes and variables:  See 2.5.      (line 1100)
dimensions, CDL, defining:                     See 5.1.      (line 2026)
dimensions, CDL, initializing:                 See 5.3.      (line 2209)
dimensions, introduction:                      See 2.2.      (line  876)
dimensions, length, CDL:                       See 5.3.      (line 2209)
dimensions, unlimited:                         See 2.2.      (line  876)
DODS:                                          See 1.8.      (line  409)
double:                                        See 5.2.      (line 2155)
double, CDL data type:                         See 5.2.      (line 2132)
external data types:                           See 3.1.      (line 1137)
F90 API:
          See ``Summary''.                                   (line  122)
FAN <1>:                                       See 5.        (line 1994)
FAN:                                           See 1.8.      (line  409)
fflush:                                        See 4.6.      (line 1894)
file format:                                   See Appendix C.
                                                             (line 2828)
file format, 64-bit offset:                    See 4.1.      (line 1644)
file format, classic:                          See 4.1.      (line 1644)
file structure, overview:                      See 4.        (line 1620)
float:                                         See 5.2.      (line 2149)
float, CDL data type:                          See 5.2.      (line 2132)
flushing buffers:                              See 4.6.      (line 1894)
format selection advice:                       See 1.4.      (line  284)
Fortran 77 API:
          See ``NetCDF Users Guide''.                        (line    4)
Fortran 90 API:
          See ``NetCDF Users Guide''.                        (line    4)
Fortran API:
          See ``Summary''.                                   (line  122)
FORTRAN_format:                                See Appendix B.
                                                             (line 2776)
future plans for netCDF:                       See 1.11.     (line  673)
GBytes:                                        See 1.10.     (line  597)
generating C code via ncgen:                   See 5.4.      (line 2291)
generic applications:                          See 2.4.      (line 1022)
GiBytes:                                       See 1.10.     (line  597)
global attributes:                             See 2.4.      (line 1022)
history:                                       See Appendix B.
                                                             (line 2787)
I/O layer:                                     See 4.6.      (line 1894)
initializing CDL:                              See 5.3.      (line 2209)
int:                                           See 5.2.      (line 2143)
int, CDL data type:                            See 5.2.      (line 2132)
Interface Guide, C:
          See ``NetCDF Users Guide''.                        (line    4)
Interface Guide, C++:
          See ``NetCDF Users Guide''.                        (line    4)
Interface Guide, Fortran 77:
          See ``NetCDF Users Guide''.                        (line    4)
Interface Guide, Fortran 90:
          See ``NetCDF Users Guide''.                        (line    4)
Java API:
          See ``Summary''.                                   (line  122)
Java API, history:                             See 1.8.      (line  409)
large file support:                            See 4.3.      (line 1750)
LFS:                                           See 4.3.      (line 1750)
limitations of netCDF:                         See 1.10.     (line  597)
long:                                          See 5.2.      (line 2146)
long, CDL data type:                           See 5.2.      (line 2132)
long_name:                                     See Appendix B.
                                                             (line 2663)
MATLAB API, history:                           See 1.8.      (line  409)
missing_value:                                 See Appendix B.
                                                             (line 2750)
multiple unlimited dimensions:                 See 2.2.      (line  876)
naming conventions:                            See 2.1.      (line  764)
NASA CDF format:                               See 1.8.      (line  409)
NC_BYTE:                                       See 2.3.      (line  931)
NC_CHAR:                                       See 2.3.      (line  931)
NC_DOUBLE:                                     See 2.3.      (line  931)
NC_FLOAT:                                      See 2.3.      (line  931)
NC_INT:                                        See 2.3.      (line  931)
NC_LONG:                                       See 2.3.      (line  931)
NC_SHARE:                                      See 4.6.      (line 1894)
NC_SHORT:                                      See 2.3.      (line  931)
nc_sync:                                       See 4.6.      (line 1894)
ncdump:                                        See 5.5.      (line 2378)
ncdump, introduction:                          See 2.1.      (line  764)
ncdump, overview:                              See 5.        (line 1994)
ncgen:                                         See 5.4.      (line 2291)
ncgen, overview:                               See 5.        (line 1994)
ncmeta:                                        See 5.        (line 1994)
NcML:                                          See 1.8.      (line  409)
NCO:                                           See 1.8.      (line  409)
ncrob:                                         See 5.        (line 1994)
netCDF 4.0:                                    See 1.11.     (line  673)
netCDF data model:                             See 2.1.      (line  764)
netCDF data types:                             See 2.3.      (line  931)
NETCDF_FFIOSPEC:                               See 4.7.      (line 1936)
New Mexico Institute of Mining:                See 1.8.      (line  409)
new netCDF features in 3.6.2:                  See 1.9.      (line  591)
nf_byte:                                       See 2.3.      (line  931)
nf_char:                                       See 2.3.      (line  931)
nf_double:                                     See 2.3.      (line  931)
nf_float:                                      See 2.3.      (line  931)
nf_int1:                                       See 2.3.      (line  931)
nf_int2:                                       See 2.3.      (line  931)
nf_real:                                       See 2.3.      (line  931)
NF_SHARE:                                      See 4.6.      (line 1894)
nf_short:                                      See 2.3.      (line  931)
NF_SYNC:                                       See 4.6.      (line 1894)
Northwestern University:                       See 1.8.      (line  409)
OPeNDAP:                                       See 1.8.      (line  409)
operations on attributes:                      See 2.4.      (line 1022)
performance of NetCDF:                         See 4.        (line 1620)
performance, introduction:                     See 1.5.      (line  347)
plans for netCDF:                              See 1.11.     (line  673)
primary variables:                             See 2.3.      (line  931)
python API, history:                           See 1.8.      (line  409)
real:                                          See 5.2.      (line 2152)
real, CDL data type:                           See 5.2.      (line 2132)
references:                                    See 1.12.     (line  703)
ruby API, history:                             See 1.8.      (line  409)
scale_factor:                                  See Appendix B.
                                                             (line 2700)
SeaSpace, Inc:                                 See 1.8.      (line  409)
share flag:                                    See 4.6.      (line 1894)
shared dataset I/O access:                     See 4.6.      (line 1894)
short:                                         See 5.2.      (line 2140)
short, CDL data type:                          See 5.2.      (line 2132)
signedness:                                    See Appendix B.
                                                             (line 2758)
SNIDE:                                         See 1.8.      (line  409)
software list:                                 See 5.        (line 1994)
storing ancillary data:                        See 2.4.      (line 1022)
structures, data:                              See 3.4.      (line 1578)
supported programming languages:
          See ``Summary''.                                   (line  122)
Tcl/Tk API, history:                           See 1.8.      (line  409)
Terascan data format:                          See 1.8.      (line  409)
title:                                         See Appendix B.
                                                             (line 2783)
type conversion:                               See 3.3.      (line 1517)
udunits:                                       See Appendix A.
                                                             (line 2545)
UNICOS:                                        See 4.7.      (line 1936)
units:                                         See Appendix B.
                                                             (line 2653)
units library:                                 See Appendix A.
                                                             (line 2545)
University of Miami:                           See 1.8.      (line  409)
unlimited dimensions:                          See 2.2.      (line  876)
utilities:                                     See 5.        (line 1994)
valid_max:                                     See Appendix B.
                                                             (line 2671)
valid_min:                                     See Appendix B.
                                                             (line 2668)
valid_range:                                   See Appendix B.
                                                             (line 2674)
variable types:                                See 2.3.      (line  931)
variables vs. attributes:                      See 2.5.      (line 1100)
variables, CDL, defining:                      See 5.1.      (line 2026)
variables, CDL, initializing:                  See 5.3.      (line 2209)
variables, coordinate:                         See 2.3.      (line  931)
variables, data types, CDL:                    See 5.3.      (line 2209)
variables, defined:                            See 2.3.      (line  931)
variables, primary:                            See 2.3.      (line  931)
WetCDF, history:                               See 1.8.      (line  409)
workshop, CDF:                                 See 1.8.      (line  409)
writers, multiple:                             See 1.10.     (line  597)
XDR format:                                    See 1.3.      (line  249)
XDR layer:                                     See 4.2.      (line 1730)
XDR, introduction into netCDF:                 See 1.8.      (line  409)
