2.2 NETCDF DATA

by Catherine Nunkai — last modified 2014-11-13 13:52

The Network Common Data Form (netCDF) is an interface to a library of data access routines for storing and retrieving scientific data. NetCDF allows the creation of data sets which are self-describing and platform-independent. NetCDF was created under contract with the Division of Atmospheric Sciences of the National Science Foundation and is available from the Unidata Program Center in Boulder, Colorado.

Ferret expects netCDF files to adhere to the COARDS conventions. Ferret's treatment of netCDF files is also consistent with the CF standard for netCDF files (http://cfconventions.org), although Ferret does not implement all of that standard.


See the chapter "Converting Data to NetCDF" for a complete description of how to create netCDF data sets or how to convert existing data sets into netCDF.

To output a variable in NetCDF, simply use:

yes? LIST/FORMAT=CDF variable_name

LIST/FORMAT=CDF (alias SAVE) can also be used with abstract variables:

yes? SAVE/FILE=example.cdf/I=1:100 sin(I/100)

This will create a file named example.cdf.

The current region and data sets determine the variable names in the saved file and the range over which they are saved. Saved data can then be accessed as follows:

yes? USE example

(USE is an alias for SET DATA/FORMAT=CDF.)

To read a netCDF dataset that is on a DODS (also known as OPeNDAP) server, simply specify the DODS address in quotes:

yes? use "http://www.ferret.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc"

To check whether a dataset exists or a URL is available, the TEST_OPENDAP function returns the flag that the netCDF library sends back on an attempt to open the data.

If a filename is not specified for writing, Ferret will generate one. (See the command SET LIST/FILE in the Commands Reference section). An example of converting TMAP-formatted data to netCDF goes as follows:

yes? SET DATA coads_climatology
yes? SAVE/L=1 sst,airt,uwnd,vwnd

These commands will save sst, airt, uwnd, and vwnd at the first time step over their entire regions to a netCDF file named by Ferret.

One advantage of using netCDF is that users on a different system (e.g., VMS instead of Unix) with different software (e.g., an analysis tool other than Ferret) can share data easily without substantial conversion work. NetCDF files are self-describing; with a simple command the size, shape and description of all variables, grids and axes can be seen.


2.2.1 NetCDF data and strides

Beginning with Ferret version 5.1, the internal functioning of netCDF reads has been changed when "strides" are involved. Suppose that CDFVAR represents a variable from a netCDF file. In version 5.0 and earlier the command PLOT CDFVAR[L=1:1000:10] would have read the entire array of 1000 points from the file; Ferret's internal logic would have subsampled every 10th point from the resulting array in a manner that was consistent for netCDF variables, ASCII variables, user defined variables, etc. In V5.1 strides applied to netCDF variables are given special treatment: subsampling is done by the netCDF library. The primary benefit of this is to make network access to remote data sets via DODS more efficient. Beginning with Ferret v5.4, strides can be applied across the "branch point" of a modulo variable without loss of efficiency for the netCDF data set, as long as the stride is an integer fraction of the modulo length times the number of points on the axis. A remote satellite image of size, say, 1000x1000 points x 8 bit depth (8 megabits, about 1 megabyte) can efficiently be previewed using

SHADE DODS_VAR[i=1:1000:10,j=1:1000:10]
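The saving from the strided preview above can be estimated with a short sketch. It assumes only that a subscript of the form lo:hi:stride selects lo, lo+stride, ... up to hi inclusive; the function name strided_count is illustrative and is not part of Ferret.

```python
def strided_count(lo, hi, stride):
    """Number of points selected by a subscript lo:hi:stride (ends inclusive)."""
    if stride < 1 or hi < lo:
        raise ValueError("expected stride >= 1 and hi >= lo")
    return (hi - lo) // stride + 1


# The 1000x1000 image previewed with i=1:1000:10, j=1:1000:10
full_points = 1000 * 1000
preview_points = strided_count(1, 1000, 10) * strided_count(1, 1000, 10)
reduction = full_points // preview_points

print(preview_points, reduction)  # 10000 100
```

Under these assumptions the preview requests 10,000 points rather than 1,000,000, which is why letting the netCDF library do the subsampling matters most for remote data.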

If a grid or axis from a netCDF file is used in the definition of a LET-defined variable (e.g. LET my_X = X[g=sst[D=coads_climatology]]) that variable definition will be invalidated when the data set is canceled (CANCEL DATA coads_climatology, in the preceding example). There is a single exception to this behavior: netCDF files such as climatological_axes.cdf, which define grids or axes that are not actually used by any variables. These grids and axes will remain defined even after the data set itself has been canceled. They may be deleted with explicit use of CANCEL GRID or CANCEL AXIS.

In Ferret version 6.02 we introduce a method whereby a grid may be redefined with strided axes. This "native stride" syntax means that the stride information needs to be specified only once, and variable names do not need to be changed.

Old syntax:

yes? SET DATA mydat.nc
yes? LET strided_var = var[i=1:1000:10,j=1:1000:10]

yes? FILL strided_var ! Use the new name strided_var everywhere.

New syntax:

yes? SET DATA mydat.nc
yes? SET AXIS/STRIDE=10 `var,RETURN=xaxis`
yes? SET AXIS/STRIDE=10 `var,RETURN=yaxis`

yes? FILL var ! The original variable name can be used

An offset may be specified on the SET AXIS/STRIDE command with SET AXIS/STRIDE=/OFFSET=. The offset value must be less than the stride value, and it refers to the first index to use:

Old syntax:

yes? SET DATA mydat.nc
yes? LET strided_var = var[i=4:1000:10]

New syntax:

yes? SET DATA mydat.nc
yes? SET AXIS/STRIDE=10/OFFSET=4 `var,RETURN=xaxis`

This syntax associates a new strided axis with the original axis. Everywhere that original axis is used, the new strided behavior will be applied. This means that all variables from all datasets that share the same exact axis will appear on the new strided axis. The original axis definition still exists and we can cancel the stride behavior with

yes? CANCEL AXIS/STRIDE axisname


2.2.2 NetCDF data attributes

Beginning with Ferret V6.0, Ferret has access to attributes of netCDF variables, including coordinate variables. In fact, attributes can be defined and used for user variables and variables from any kind of dataset. See the section in the next chapter about dataset and variable attributes.


2.2.3 NetCDF Data with the bounds attribute

The CF standard for netCDF files defines a bounds attribute for coordinate axes, where the upper and lower bounds of the grid cells along an axis are specified by a bounds variable which is of size N*2 for an axis of length N. See Section 7.1 of the CF document:

http://cfconventions.org/

The coordinates on the axis may be anywhere within the cells defined by the upper and lower cell bounds. Ferret uses these as the upper and lower bounds of axis cells (also known as boxes). They may be listed or otherwise accessed using the pseudo-variables XBOXLO, XBOXHI, YBOXLO, etc.

For example, a simple netCDF file with bounds would have the following ncdump output:

%  ncdump Fsst.cdf 
netcdf Fsst {
dimensions:
XAX = 4 ;
YAX = 4 ;
TIME = UNLIMITED ; // (1 currently)
variables:
double XAX(XAX) ;
XAX:units = "degrees_east" ;
XAX:modulo = " " ;
XAX:point_spacing = "even" ;
XAX:axis = "X" ;
double YAX(YAX) ;
YAX:units = "degrees_north" ;
YAX:point_spacing = "even" ;
YAX:axis = "Y" ;
double TIME(TIME) ;
TIME:units = "hour since 0000-01-01 00:00:00" ;
TIME:time_origin = "01-JAN-0000 00:00:00" ;
TIME:modulo = " " ;
TIME:axis = "T" ;
float SST(TIME, YAX, XAX) ;
SST:missing_value = -1.e+34f ;
SST:_FillValue = -1.e+34f ;
SST:long_name = "SEA SURFACE TEMPERATURE" ;
SST:history = "From coads_climatology" ;
SST:units = "Deg C" ;

// global attributes:
:history = "FERRET V6.08 29-Nov-07" ;
:Conventions = "CF-1.0" ;
:title = "Fsst.cdf" ;
data:

XAX = 181, 183, 185, 187 ;

YAX = 1, 3, 5, 7 ;

TIME = 366 ;

SST =
28.28389, 28.41474, 27.91529, 27.61643,
27.97333, 28.2525, 28.28312, 27.94364,
28.246, 28.06312, 28.41238, 28.1219,
27.89824, 28.10538, 27.72812, 27.64556 ;
}

The CF standard allows for axes in a file that may have discontiguous bounds (the upper bound of one cell is not the same as the lower bound of the next cell). Ferret does not allow such an axis. When discontiguous bounds are encountered in a file, we arbitrarily choose to use the lower bounds throughout, with the upper bound of the topmost cell to close the definition. This way all axes have contiguous upper and lower bounds. A warning message is issued.
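The rule described above for discontiguous bounds can be sketched as follows. This is an illustration of the stated rule only, not Ferret's internal code; the function name make_contiguous and the sample values are hypothetical.

```python
def make_contiguous(bounds):
    """Close gaps between cells by keeping every lower bound and only the
    upper bound of the topmost cell.

    bounds: list of (lower, upper) pairs, one per cell, in increasing order.
    """
    if not bounds:
        return []
    # All lower bounds, then the upper bound of the last cell, form the edges.
    edges = [lower for lower, _ in bounds] + [bounds[-1][1]]
    return list(zip(edges[:-1], edges[1:]))


# Hypothetical axis with gaps between 1.0 and 1.5 and between 2.5 and 3.0
gappy = [(0.0, 1.0), (1.5, 2.5), (3.0, 4.0)]
print(make_contiguous(gappy))  # [(0.0, 1.5), (1.5, 3.0), (3.0, 4.0)]
```

Each cell's upper bound becomes the next cell's lower bound, so the resulting axis has no gaps.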

DEFINE AXIS/BOUNDS may be used to create an axis with cell bounds. All irregular axes are saved with a bounds attribute (beginning with Ferret v5.70), and the user may request that all axes be written with the bounds attribute with the SAVE/BOUNDS command.

Note that if you have a dataset that has an irregular time axis and a bounds attribute on that axis and you force Ferret to apply a regular time axis with

yes? USE/REGULART my_data.nc

then the bounds are ignored: the regular time axis is formed from the first and last coordinate and the number of points.
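A minimal sketch of forming a regular axis from the first coordinate, the last coordinate and the number of points, as described above. It assumes evenly spaced points between the two end values; regular_axis and the sample times are hypothetical, and the code is not taken from Ferret.

```python
def regular_axis(first, last, npoints):
    """Evenly spaced coordinates from first to last, npoints in total."""
    if npoints < 1:
        raise ValueError("npoints must be at least 1")
    if npoints == 1:
        return [first]
    step = (last - first) / (npoints - 1)
    return [first + k * step for k in range(npoints)]


# Hypothetical irregular time coordinates (hours)
irregular = [0.0, 10.0, 15.0, 40.0, 60.0]
regular = regular_axis(irregular[0], irregular[-1], len(irregular))
print(regular)  # [0.0, 15.0, 30.0, 45.0, 60.0]
```

Only the end points and the count survive; the interior coordinates and any bounds from the file play no part in the result.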


2.2.4 Multi-file NetCDF data sets

Ferret supports collections of netCDF files that are regarded as a single netCDF data set. Such data sets are referred to as "MC" (multi CDF) data sets. They are particularly useful to manage the outputs of numerical models. MC data sets use a descriptor file, in the style of TMAP-formatted data sets. The data set is referred to inside Ferret by the name of this descriptor file.

A collection of netCDF files is suitable to form a multi-file data set if

1) The files are connected through their time axis—each file represents one or more time snapshots of the variables it contains.

2) All non-time-dependent variables in the data set must be contained in the first file of the data set (or those variables will not appear in the merged, MC, data set).


Note that previous to version 5.2, each file had to be self-documenting with respect to the time axis of the variables, even if the time axis represented only a single point. (All of the time axes had to be identically encoded with respect to units and date of the time origin.) In version 5.3 and higher these checks are not performed. This means that the MC descriptor mechanism can be used to associate, as a time series, groups of files that are not internally self-documenting with respect to time. See Creating a Multi-File netCDF Data Set.

Beginning with version 5.8 of Ferret the stepfiles may contain different scale and offset values for the variables they contain. See the section on Standardized NetCDF attributes. Ferret reads and applies the scale and offset values as data from each stepfile is read. Note that the commands

yes? SAY `var, RETURN=nc_offset`
yes? SAY `var, RETURN=nc_scale`

return the latest scale and offset values that were applied.

A typical MC descriptor file may be found in the chapter "Converting to netCDF", in the section "Creating a multi-NetCDF data set".


2.2.5 Non-standard NetCDF data sets

As discussed in the Chapter, "Converting Data to NetCDF", Ferret expects netCDF files to adhere to the COARDS conventions (http://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html). If the files do not adhere to the COARDS conventions, Ferret will still attempt to access them. Often, the user can use Ferret controls for regridding, reshaping, and otherwise transforming data to recover the intended file contents.

Here are a few common ways in which netCDF files may deviate from the COARDS standard and how one may cope with those situations in Ferret.

  • Files with disordered coordinates

In the COARDS conventions an axis (a.k.a. "coordinate variable") must have monotonically-increasing coordinate values. If the coordinates are disordered or repeating in a netCDF file, then Ferret will present the coordinates to the user (in SHOW DATA) as a dependent variable, whose name is the axis name, and it will substitute an axis of the index values 1, 2, 3, ... Note that Ferret will apply this same behavior when files have long irregular axis definitions that exceed Ferret's axis memory capacity.

  • Files with reverse-ordered axes

If the coordinates of an axis are monotonically decreasing, instead of increasing, Ferret will transparently reverse both the axis coordinates and the dependent variables that are defined upon that axis. Note that if Ferret writes a reverse-ordered variable to a new netCDF file (with the SAVE command), the coordinates and data in the output file will be in monotonically increasing coordinate order—reversed from the input file.  In Ferret v6.83 and higher, Ferret will issue a NOTE when it does this:

yes? use reversed_axes.cdf
*** NOTE: Axis coordinates are decreasing-ordered. Reversing ordering for axis Y1010_REV

If the values of a dependent variable are reversed, but there is no associated coordinate axis, then attach a minus sign to the corresponding axis orientation in the USE/ORDER= qualifier to designate that the variable(s) should be reversed along the corresponding axis.

For more on this topic see the FAQ, "Are there any disadvantages to using reversed coordinates?"
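The two axis cases above (disordered and reverse-ordered coordinates) can be summarized in one sketch for a one-dimensional variable. This only mirrors the behavior described in the text; normalize_axis and the sample data are hypothetical and do not reflect Ferret's implementation.

```python
def normalize_axis(coords, values):
    """Return (axis, values, leftover) for a 1-D variable.

    - increasing coordinates: used as the axis unchanged
    - decreasing coordinates: axis and data are both reversed
    - disordered or repeating: axis becomes the index values 1, 2, 3, ...
      and the original coordinates are returned as a dependent variable
    """
    pairs = list(zip(coords, coords[1:]))
    if all(a < b for a, b in pairs):
        return list(coords), list(values), None
    if all(a > b for a, b in pairs):
        return list(coords[::-1]), list(values[::-1]), None
    return list(range(1, len(coords) + 1)), list(values), list(coords)


# Reverse-ordered latitude axis: coordinates and data are flipped together
print(normalize_axis([30, 20, 10], [25.1, 27.3, 28.9]))
# Disordered axis: an index axis is substituted
print(normalize_axis([1, 3, 2], [7, 8, 9]))
```

In the reversed case the data stay attached to the same coordinates; only the storage order changes, which is why a file written afterwards comes out in increasing order.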

  • Files with "invalid" variable names

The COARDS standard specifies that variable names should begin with a letter and be composed of letters, digits, and underscores. In files where the variable names contain other characters, references to those variable names in Ferret must be enclosed in single quotes.
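As a rough illustration of the naming rule above, the following sketch decides whether a name needs single quotes. It checks only the rule stated here (a leading letter, then letters, digits, and underscores); Ferret may apply further restrictions that this does not model, and ferret_reference is a hypothetical helper.

```python
import re

# A leading letter followed by letters, digits, or underscores
_COARDS_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def ferret_reference(name):
    """Return the name as it would be written in a Ferret expression."""
    if _COARDS_NAME.fullmatch(name):
        return name
    return "'" + name + "'"


for name in ("sst", "sea-temp", "2m_temp"):
    print(ferret_reference(name))
```

Here sst passes through unchanged, while sea-temp and 2m_temp come back enclosed in single quotes.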

  • Files with permuted axis ordering

The COARDS standard specifies that if any or all of the dimensions of a variable have the interpretations of "date or time" (a.k.a. "T"), "height or depth" (a.k.a. "Z"), "latitude" (a.k.a. "Y"), or "longitude" (a.k.a. "X") then those dimensions should appear in the relative order T, then Z, then Y, then X in the CDL definition corresponding to the file. In files where the axis ordering has been permuted, the command qualifier USE/ORDER= allows the user to inform Ferret of the correct permutation of coordinates. Note that if Ferret writes a permuted variable to a new netCDF file (with the SAVE command), the coordinates and data in the output file will be in standard X-Y-Z-T ordering (as indicated in the user's /ORDER specification), permuted from the original file ordering. See the Command Reference for a complete description of the ORDER qualifier.

  • Files with more than four dimensions

The COARDS standard specifies that a netCDF file may be created with more than four dimensions. However the Ferret framework allows just four dimensions at this time.


2.2.6 NetCDF and non-standard calendars

The netCDF conventions document discusses and defines usage for different calendar axes. These conventions for calendars are implemented in Ferret version 5.3. See the discussion of calendars in Section 4 of the CF Conventions document at

   http://cfconventions.org/

The calendars allowed are:

GREGORIAN or STANDARD (default)

Ferret uses the proleptic Gregorian calendar, which is the Gregorian calendar extended to dates before 1582-10-15.

NOLEAP or 365_DAY

All years are 365 days long.

ALL_LEAP or 366_DAY

All years are 366 days long.

360_DAY

All years are 360 days, divided into 30-day months.

JULIAN

Julian calendar; leap years with no adjustment at the turn of the century.
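The year lengths implied by the calendars listed above can be sketched as a small function. It assumes the usual leap-year rules (every fourth year for JULIAN; the 100- and 400-year exceptions for the proleptic Gregorian calendar); days_in_year is a hypothetical helper, not a Ferret function.

```python
def days_in_year(year, calendar="GREGORIAN"):
    """Length of a year, in days, under each calendar name listed above."""
    cal = calendar.upper()
    if cal in ("GREGORIAN", "STANDARD"):
        # Proleptic Gregorian: century years are leap only if divisible by 400
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
        return 366 if leap else 365
    if cal in ("NOLEAP", "365_DAY"):
        return 365
    if cal in ("ALL_LEAP", "366_DAY"):
        return 366
    if cal == "360_DAY":
        return 360
    if cal == "JULIAN":
        # Every fourth year is a leap year, with no century adjustment
        return 366 if year % 4 == 0 else 365
    raise ValueError("unknown calendar: " + calendar)


for cal in ("GREGORIAN", "NOLEAP", "ALL_LEAP", "360_DAY", "JULIAN"):
    print(cal, days_in_year(1900, cal))
```

The year 1900 shows the difference between the two leap-year rules: it has 365 days in the Gregorian calendar and 366 in the Julian calendar.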

These calendars are compatible with the Udunits standard which has slightly different naming conventions, except that the gregorian or standard calendar is a proleptic Gregorian calendar in Ferret. If the mixed Julian/Gregorian calendar is desired, use a time origin of 1-jan-0001:00:00 and Ferret will apply the 2-day shift that was made historically when the Gregorian calendar was introduced. The Udunits standard can be found at:

http://www.unidata.ucar.edu/software/udunits/udunits-1/udunits.txt

udunits.dat (A local copy of the above link).

The netCDF conventions recommend that the calendar be specified by the attribute time:calendar, which is assigned to the time coordinate variable when there is a non-Gregorian calendar associated with a data set, for example:

time:calendar=noleap

Ferret reads this attribute when it is present in a netCDF file and assigns the appropriate calendar identifier to the variable. When a variable has a non-Gregorian calendar, the attribute is also written whenever the variable is output to a netCDF file.



2.2.7 NetCDF "packed data"

The netCDF conventions documents describe the use of scale_factor and add_offset attributes for packing data. When Ferret encounters a variable in a netCDF file that has these attributes, it automatically applies them as the data is read, returning the scaled data. For details, please see the description of the scale_factor and add_offset attributes in the COARDS standards document at http://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html
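The unpacking that the scale_factor and add_offset attributes imply follows the standard convention unpacked = packed * scale_factor + add_offset. The sketch below applies it to made-up values; the function name unpack and the numbers are hypothetical.

```python
def unpack(packed, scale_factor=1.0, add_offset=0.0):
    """Apply the conventional packing formula to a sequence of stored values."""
    return [p * scale_factor + add_offset for p in packed]


# Hypothetical stored integers with scale_factor = 0.25 and add_offset = 20.0
stored = [0, 40, 101]
print(unpack(stored, scale_factor=0.25, add_offset=20.0))  # [20.0, 30.0, 45.25]
```

A missing attribute is treated here as the identity (scale 1, offset 0), so unpacked files pass through unchanged.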



2.2.8 NetCDF version 4 

Unidata has released netCDF-4, which includes classic netCDF dataset access and also netCDF-4 files with HDF5 capabilities for compression and chunking. With the release of netCDF-4.1, this also includes remote data access with a built-in OPeNDAP client. Starting with Ferret v6.6, Ferret is linked with netCDF-4.1 libraries instead of the libraries from opendap.org. Please see the full netCDF documentation and descriptions of the use of HDF5 with netCDF-4 at the Unidata NetCDF pages,

http://www.unidata.ucar.edu/software/netcdf/

In Ferret v6.6, the Ferret syntax is expanded to take advantage of netCDF-4.  All previously existing syntax works with no changes needed.  Data that Ferret writes is compatible with netcdf-3 and netcdf-4 libraries by default; special netcdf-4 capabilities such as compression are employed only if settings are made to request those features.

The Ferret symbol NETCDF_VERSION is set at startup. It holds the string reported by the netCDF library when queried for its version; the string begins with the version number.

SHOW SYMBOL NETCDF_VERSION

yes? SHOW SYMBOL NETCDF_VERSION
NETCDF_VERSION = "4.1 of Feb  5 2010 16:32:49

Netcdf-4 - formatted files are read by Ferret with no intervention from the user, except possibly to adjust the chunk cache for better performance.

SHOW NCCACHE    Lists the current settings for the CACHE
Chunk cache size is listed in mbytes. The current default set by the netCDF-4.1 library is 4.1943 Mbytes.

SET NCCACHE changes the chunk cache settings from the defaults. The options are:

SET NCCACHE/SIZE/NELEMS/PREEMPT
    /SIZE=   new size in mb
    /NELEMS  see the netCDF-4 documentation
    /PREEMPT see the netCDF-4 documentation

CANCEL NCCACHE  restores the default settings for the chunk cache

For example,

yes? SHOW NCCACHE
   Current NCDF Chunk Cache size 4.1943 MB, n_elems = 1009, preemption = 75

yes? SET NCCACHE/SIZ=16
yes? SHOW NCCACHE
   Current NCDF Chunk Cache size 16 MB, n_elems = 1009, preemption = 75

yes? CANCEL NCCACHE
   Restoring default chunk cache settings

yes? SHOW NCCACHE
   Current NCDF Chunk Cache size 4.1943 MB, n_elems = 1009, preemption = 75

SAVE/NCFORMAT/DEFLATE/SHUFFLE/ENDIAN/XCHUNK/YCHUNK/ZCHUNK/TCHUNK

Qualifiers on LIST/FORMAT=CDF (e.g. SAVE). For more details see documentation on the LIST command in the Commands Reference section. These can be set differently for variables in a dataset.

/NCFORMAT = netcdf4, classic, 4, 3, 64BIT_OFFSET. (4=netcdf4, 3=classic). By default Ferret writes classic files.

/ENDIAN =   big, little, native

/DEFLATE = deflate level, 0 through 9; if specified with no argument,  the deflate level  is set to 1. (1 is recommended for compression)

/SHUFFLE = 0 or 1; if /shuffle with no argument, it is set to 1

/XCHUNK/YCHUNK/ZCHUNK/TCHUNK = chunk sizes in each index direction, integer values. If these are not specified, the compression will use the default chunk-size scheme of the NetCDF-4 library. If you specify a chunk size for any dimension, you must specify it for all of the dimensions present in the grid.  Chunking is done only if the file has been set to be written in NetCDF-4 format.

For example,

yes? USE ocean_atlas_subset
yes? ! save with default chunk sizes
yes? SAVE/FILE=deflate.nc/CLOBBER/NCFORMAT=4/DEFLATE/SHUFFLE temp

yes? ! or specify chunks
yes? SAVE/FILE=deflate.nc/CLOBBER/NCFORMAT=4/DEFLATE=1\
/XCHUNK=30/YCHUNK=30/ZCHUNK=1/TCHUNK=1 temp
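To get a feel for what the chunk sizes in the example above mean, the following sketch counts the chunks along each dimension. The grid shape used here is purely hypothetical (the dimensions of ocean_atlas_subset are not given in this section), and chunk_counts is an illustrative helper rather than part of Ferret or the netCDF library.

```python
def chunk_counts(shape, chunks):
    """Number of chunks needed along each dimension (last chunk may be partial)."""
    if len(shape) != len(chunks):
        raise ValueError("a chunk size is needed for every dimension of the grid")
    # Ceiling division using integers only
    return [-(-n // c) for n, c in zip(shape, chunks)]


# Hypothetical X, Y, Z, T grid with /XCHUNK=30/YCHUNK=30/ZCHUNK=1/TCHUNK=1
shape = (180, 90, 10, 12)
chunks = (30, 30, 1, 1)
counts = chunk_counts(shape, chunks)

total = 1
for n in counts:
    total *= n

print(counts, total)  # [6, 3, 10, 12] 2160
```

With chunk sizes of 1 in Z and T, each chunk holds part of a single horizontal slice, which suits access patterns that read one level and one time step at a time.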

The following qualifiers on SET LIST can be used to make settings for subsequent SAVE commands. More details may be found in the documentation for the command SET LIST in the Commands Reference section.

SET LIST/NCFORMAT/ENDIAN/DEFLATE/SHUFFLE 

    /NCFORMAT = netcdf4, classic, 4, 3, 64BIT_OFFSET. (4=netcdf4, 3=classic)
    /ENDIAN   = big, little, native
    /DEFLATE  = deflate level, 0 through 9 (1 is recommended for compression)
    /SHUFFLE  = 0 or 1

For example,

yes? SET LIST/NCFORMAT=4/DEFLATE=1  ! Will make these settings on SAVE commands


CANCEL LIST/all  Restores default settings
