How do I get my ascii/binary/etc. gridded data into LAS?
Unfortunately, there seem to be as many data formats as there are investigators collecting data. Many institutions and individual investigators have chosen to store their data in one of a handful of widely used formats. Many others have not. Even when a widely accepted format is specified, there are always ways to abuse it, often involving nonstandard use of metadata.
The goal of LAS is to present a uniform interface to data while minimizing the work of the person setting up LAS. It thus behooves the LAS manager to adopt some uniform standards for data to be presented in LAS. This FAQ outlines some basic standards that will make ingesting data into LAS as easy as can reasonably be expected.
Note on ASCII files:
Although LAS can be made to read ASCII or "flat" binary data files
as-is, we recommend against this approach. Instead, convert your ASCII
and binary data files to netCDF. Files that are not in netCDF cannot be
read with random-access techniques, which means LAS must read the entire
file even when the user has requested only a small fraction of the data,
with obvious negative impacts on performance. Converting to netCDF also
allows the automated configuration tools in LAS to be brought to bear. In
most cases the Ferret program, which is part of LAS, can be used to
create netCDF versions of your data files.
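If you prefer to script the conversion yourself rather than use Ferret, the idea can be sketched in Python. This is a minimal sketch using scipy's bundled netCDF-3 writer (any of the Unidata interfaces listed below would work equally well); the file names, grid spacing, and variable name here are hypothetical placeholders, and in practice the placeholder grid would come from your ASCII file via `np.loadtxt`:

```python
# Sketch: write a gridded field to a COARDS-style netCDF file.
# All names (chl.nc, CHL, the 1-degree grid) are hypothetical.
import numpy as np
from scipy.io import netcdf_file

nlat, nlon = 180, 360
# In practice: data = np.loadtxt("your_ascii_file.txt").reshape(nlat, nlon)
data = np.zeros((nlat, nlon), dtype="f4")  # placeholder grid for this sketch

f = netcdf_file("chl.nc", "w")
f.Title = "Example chlorophyll grid"       # a global attribute

f.createDimension("LONGITUDE", nlon)
f.createDimension("LATITUDE", nlat)

# COARDS identifies axes by their units attributes.
lon = f.createVariable("LONGITUDE", "f", ("LONGITUDE",))
lon[:] = np.arange(0.5, 360.5, 1.0, dtype="f4")
lon.units = "degrees_east"

lat = f.createVariable("LATITUDE", "f", ("LATITUDE",))
lat[:] = np.arange(-89.5, 90.5, 1.0, dtype="f4")
lat.units = "degrees_north"

chl = f.createVariable("CHL", "f", ("LATITUDE", "LONGITUDE"))
chl[:] = data
chl.units = "mg/m^3"
chl.missing_value = np.float32(1.0e35)
f.close()
```

The resulting file can then be checked with ncdump and fed to addXml.pl as described below.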
LAS includes a script (addXml.pl) that can automatically create an XML configuration file from your data if it is in netCDF format and adheres to the COARDS conventions:
netCDF
- netCDF is a widely used, self-describing, machine-independent format that allows for array-oriented data subsetting.
- It is both free and well supported by Unidata.
- Interfaces are available for C, Fortran, C++, Java, Perl, Matlab, and Python.
conventions
- In order for data to be interchangeable, it must adhere to a set of conventions for using the netCDF format.
- We encourage investigators to follow the standards set forth in the COARDS conventions.
- Specific extensions to COARDS appropriate for climate and forecast models are given in the CF conventions.
- Adherence to these three layers of standardization will greatly ease the dissemination and reuse of data.
- These recommendations have already been adopted by the OCMIP group.
After creating a netCDF version of your data, check it for the common issues that require special treatment.
Before attempting to use addXml.pl it is worth checking your data with the netCDF utility ncdump:
> ncdump -h ~file~
Doing some simple checks with Ferret is also recommended.
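A typical Ferret check is to open the file, list its variables, and inspect the grid of each one; a session might look like the following (the dataset and variable names are taken from the metadata example below and should be replaced with your own):

```
yes? use SeaWiFS_data.nc          ! open the dataset
yes? show data                    ! list variables and their grids
yes? show grid CHLOROPHYLL        ! inspect axes, units, and spacing
yes? shade CHLOROPHYLL[l=1]       ! quick-look plot of the first time step
```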
It is always best to create correct metadata within your files, but many problems can be fixed with initialization scripts or configuration options.
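Several of these metadata checks are easy to automate. The following is a sketch of such a checker, again using scipy's netCDF reader (the function name and the exact set of complaints are our own invention, not part of LAS or addXml.pl):

```python
# Sketch: scan a netCDF file for metadata gaps that commonly cause
# trouble when configuring LAS.  Any netCDF interface would do.
from scipy.io import netcdf_file

def coards_problems(path):
    """Return a list of human-readable metadata complaints for `path`."""
    problems = []
    f = netcdf_file(path, "r", mmap=False)
    for name, var in f.variables.items():
        if name in f.dimensions:
            # Coordinate variable: COARDS identifies axes by units.
            if not hasattr(var, "units"):
                problems.append("axis %s has no units attribute" % name)
        else:
            # Data variable: LAS needs units and a missing-value flag.
            if not hasattr(var, "units"):
                problems.append("variable %s has no units" % name)
            if not (hasattr(var, "missing_value")
                    or hasattr(var, "_FillValue")):
                problems.append("variable %s has no missing_value" % name)
    f.close()
    return problems

# e.g.  for p in coards_problems("SeaWiFS_data.nc"): print(p)
```

Anything the checker reports can then be repaired either in the file itself or, as noted above, through initialization scripts or configuration options.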
Here is an example of what LAS considers adequate metadata:

> ncdump -h SeaWiFS_data.nc
netcdf SeaWiFS_data {
dimensions:
        LONGITUDE = 360 ;
        LATITUDE = 180 ;
        TIME = 138 ;
variables:
        float LONGITUDE(LONGITUDE) ;
                LONGITUDE:units = "degrees_east" ;
                LONGITUDE:long_name = "Longitude" ;
                LONGITUDE:point_spacing = "even" ;
        float LATITUDE(LATITUDE) ;
                LATITUDE:units = "degrees_north" ;
                LATITUDE:long_name = "Latitude" ;
                LATITUDE:point_spacing = "even" ;
        double TIME(TIME) ;
                TIME:units = "days since 1950-01-01 00:00:00" ;
                TIME:long_name = "Time" ;
                TIME:time_origin = "1-JAN-1950" ;
                TIME:point_spacing = "uneven" ;
        float CHLOROPHYLL(TIME, LATITUDE, LONGITUDE) ;
                CHLOROPHYLL:units = "mg/m^3" ;
                CHLOROPHYLL:long_name = "Chlorophyll" ;
                CHLOROPHYLL:missing_value = 1.e+35f ;

// global attributes:
                :Title = "Global SeaWiFS chlorophyll 8 day composites" ;
                :authors = "J. Yoder & M. Kennelly" ;
                :creation_date = "26 July, 2001" ;
}
Jonathan Callahan: Jonathan.S.Callahan@noaa.gov
Last modified: July 19, 2002