PyFerret and Ferret v7.6 implement automatic handling of datasets which use the Discrete Sampling Geometries standard for CF.
In Chapter 9 of the NetCDF CF Conventions document, a set of file types for "Discrete Sampling Geometries" is defined. These datasets describe point data, trajectories, profiles, time series, timeSeriesProfiles, and trajectoryProfiles. Ferret version 7.6 implements automatic handling of files that use the contiguous ragged array representation for these data types: point data, trajectories, profiles, and time series. The handling of the feature types timeSeriesProfiles and trajectoryProfiles will be implemented in a later release.
The datasets are feature collections, where a single feature is a single instance of a timeseries, a profile, a trajectory, or a point. The observations lie along the "observation" axis, and the information describing the features lies along the "instance" axis, which is of length nfeatures. In the documentation and example scripts we use the abbreviation "DSG files". A common source for DSG datasets is ERDDAP, where a file is saved from a tabledap dataset as data type .ncCF.
When a dataset contains the relevant attributes for a Discrete Sampling Geometries dataset, Ferret assigns the observations axis to the X direction for Trajectory data, the T direction for Timeseries data, and the Z direction for Profile data, and assigns the instance axis (which describes each feature) to the E direction. For Point data each point is an instance, so the observations lie in the E direction. The sample_dimension attribute is attached to a count variable, usually called rowSize, which lists the number of observations in each feature; the attribute names the sample dimension to which it applies. Ferret locates the coordinate information and defines a "translation grid" which allows it to map data to the world coordinates for the data type: stations in XYZ for Timeseries, stations in XYT for Profiles, trajectories along XYT and perhaps Z for Trajectories, and Points in XYZT. Using the coordinate information, the ordinary region-selection commands apply, including the qualifiers /X= /Y= /Z= /T=, SET REGION, or variable[x=, y=, z=, t=]. Plots and listings are automatically labeled as always.
The variables which describe each feature are put onto the E axis.
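As a sketch of such coordinate-based selection on a DSG dataset (the variable name sst and the region limits here are illustrative assumptions, not taken from a particular dataset):

```
yes? use dsg_trajectory_example
yes? ! list only the observations inside a longitude-latitude box
yes? list/x=140e:160w/y=20n:40n sst
yes? ! or set a region that applies to subsequent commands
yes? set region/t="1-jan-2012":"31-dec-2012"
yes? plot sst
```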
Run the tutorial script for a demonstration of these features.
Here is a summary of the enhancements:
Command outputs customized to work with DSG datasets:
The output of SHOW DATA includes the DSG feature type. Feature-specific variables are listed on an E axis whose length is the number of features. Observed variables are listed with the total length of the contiguous ragged array that stores the observations.
The output of SHOW GRID lists a nominal range on the "observations" axis, representing the ragged array that stores all of the observations, and the instance axis of length number-of-features. It then summarizes the coordinate ranges of the observations, as found in the coordinate variables in the dataset.
A text listing to the terminal or to a file, like any text listing, includes a header that describes the dataset and the requested subset, followed by the data with their coordinates. Here the feature-id is added to each row of the listing, and the coordinates are all of the longitude-latitude-depth/height-time coordinates. The LIST command may take the qualifiers /X= /Y= /Z= /T= to limit the coordinate ranges shown, as well as /E= or /M= to choose a subset of features. Note that /I= /J= /K= and /L= cannot be used, because index ranges do not map to coordinate ranges for these data types. (To get a quick listing of a few data points, try "list/i=1:5 XSEQUENCE(variable)".)
See also feature-masking, to ask for a subset of features, e.g. a few of the profiles or timeseries.
yes? use dsg_profile_example
yes? list/t="23-aug-2012:00:00":"23-aug-2012:06:00" sigma_t
To write a new DSG dataset which is a subset of an existing one, specify the variable and any coordinate ranges. The coordinates and the required elements that make up a Discrete Sampling Geometries file are automatically written to the file. Further variables with the same constraints may be added to the file; however, appending more features, or appending along the time axis, is not currently allowed. This command writes only the two profiles contained in this time range:
yes? use dsg_profile_example
yes? save/file=new_file.nc/clobber/t="23-aug-2012:00:00":"23-aug-2012:06:00" sigma_t
The SET DATA/FEATURE= qualifier, or equivalently USE/FEATURE=, changes the interpretation of the dataset to use a different feature type within the current session. In particular, Trajectory data may also be viewed as Timeseries data, with time increasing along each path. Reset a Trajectory dataset to be treated as a Timeseries dataset with this setting:
yes? set data/feature=timeseries my_dataset.nc
If the dataset is already open, the dataset number may be used.
yes? set data/feature=timeseries 1
When the dataset is already open, any of its variables that have been loaded into memory are cleared, and if a feature-mask was defined it no longer applies to the dataset. Changing to another feature type applies only to viewing Trajectory data as Timeseries data; other feature types do not lend themselves to this logic.
Data of any feature type may be treated as if it were not from a Discrete Sampling Geometries file, using /FEATURE=none
yes? set data/feature=none my_dataset.nc
and the dataset will appear to be data on simple 1-D grids: observations on a long 1-D axis, and feature-level metadata (station names, etc.) on a shorter axis of length number-of-features. As above, if the dataset is already open, any variables that have been loaded into memory are cleared, and if a feature-mask was defined it no longer applies to the dataset.
The new mode MODE DSG is set by default. If it is canceled, then in the current session datasets are NOT initialized as DSG datasets, but are handled as they were prior to Ferret/PyFerret v7.6, as described in the italicized text at the end of this page.
SET DATA/FMASK= (or USE/FMASK=)
Apply a feature mask to the dataset. The feature mask is a variable of length number-of-features, with values 1 and 0, or 1 and missing. When it is applied to the dataset, only the features with mask value 1 are used in listings, plots, and other operations.
yes? use dsg_trajectory_example
yes? let pan_mask = if STRINDEX(expocode, "PAN") GT 0 then 1
yes? list/norow/nohead pan_mask
          ....
          ....
         1.000
         1.000
Apply this mask to the dataset. A plot or listing or other operation will now include only the two features included in the mask.
yes? set data/fmask=pan_mask dsg_trajectory_example
Transformations are applied separately to each feature. So transformations which summarize information about a variable, such as @MAX, @MIN, @SUM, or @NGD, are applied separately to each feature, returning the maximum, minimum, sum, or number of good data points per feature.
yes? use dsg_trajectory_example
yes? list fco2_recommended[x=@max]
Transformations such as smoothers or indefinite integrals do their operation within features as well. So a smoothing transform does not smear data from one trajectory or one profile to the next in the file. For temperature profiles, for instance, the @SBX boxcar smoother stops at the end of each profile.
yes? use dsg_profile_example
yes? list/m=1:2 temp, temp[z=@sbx]
Function calls have not yet been optimized for Discrete Sampling Geometries, and a NOTE is issued to this effect. For functions which are not grid-changing functions, this is unimportant. For instance, RHO_UN may be used to compute the mass density of seawater from temperature and salinity.
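As a sketch of such a function call on a profile dataset (the variable names salt and depth are illustrative assumptions, and pressure in decibars is approximated here by depth in meters):

```
yes? use dsg_profile_example
yes? ! RHO_UN(salt, temp, p): UNESCO seawater density; depth stands in for pressure
yes? let dens = RHO_UN(salt, temp, depth)
yes? list/m=1 dens
```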
A set of timeseries or profiles in a DSG dataset may or may not have a common set of time coordinates or depth levels, respectively. They can be regridded to a common time or z axis using an ordinary regridding operation. The result is a 2-D variable, station-vs-time, or station-vs-depth.
yes? use dsg_timeseries_example
yes? define axis/t="15-jan-2017:12:00":"21-apr-2017":10/unit=days tuniform
yes? let/like=t_25 t25_station_vs_time = t_25[gt=tuniform]
yes? shade t25_station_vs_time
Comparisons with other datasets:
A regridding operation can also be used to define a variable sampled from a gridded dataset, such as a reference dataset or model output. The data is sampled at the times and locations of the observations in a DSG dataset. Here is the sequence of operations:
yes? use gridded_data_set.nc
yes? use dsg_timeseries_data.nc
yes? let gridded_on_dsg = gridded_var[d=1, g=dsg_var[d=2] ]
yes? let difference_on_dsg = dsg_var[d=2] - gridded_on_dsg
yes? plot difference_on_dsg
To put DSG observations onto a regular grid, see the scat2grid family of gridding functions:
yes? show functions scat2grid*
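As a sketch of gridding trajectory observations with Gaussian weighting, using SCAT2GRIDGAUSS_XY (the variable name temp, the output axes, and the scale and cutoff values here are illustrative assumptions):

```
yes? use dsg_trajectory_example
yes? define axis/x=0:359:1/units=degrees_east/modulo xax
yes? define axis/y=-89:89:1/units=degrees_north yax
yes? ! arguments: x points, y points, values, output axes, x/y scales, cutoff, 0
yes? let gridded = SCAT2GRIDGAUSS_XY(longitude, latitude, temp, x[gx=xax], y[gy=yax], 2, 2, 2, 0)
yes? shade gridded
```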
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
If using older Ferret/PyFerret versions, here are some methods available for working with files such as this:
Functions that are useful for Discrete Sampling Geometries files include SEPARATE, which inserts a missing value between trajectories. They may then be plotted as individual lines, with the line broken between different trajectories:
yes? let/units=degrees_east/title=longitude separate_lon = SEPARATE(longitude, rowsize, 1)
yes? let/units=degrees_north/title=latitude separate_lat = SEPARATE(latitude, rowsize, 0)
yes? let/units="`temp,return=units`"/title="`temp,return=title`" separate_temp = SEPARATE(temp, rowsize, 0)
yes? ribbon/vs/line/thick/key/palette=rnb2 separate_lon, separate_lat, separate_temp
To choose a subset of trajectories, you can use one of the EXPNDI_BY_* functions to put the data onto an observation-by-feature axis, for instance, a temperature-by-trajectory grid, or for profile data (see the example below), salinity-by-profile. Here we use EXPNDI_BY_M_COUNTS.
yes? let longest = `rowsize[m=@max]`
yes? let lon2d = EXPNDI_BY_M_COUNTS(longitude, rowsize, longest)
yes? let lat2d = EXPNDI_BY_M_COUNTS(latitude, rowsize, longest)
yes? let var2d = EXPNDI_BY_M_COUNTS(temp, rowsize, longest)
Say the trajectory names include the year of the observations. To pick out the ones deployed in 2010:
yes? let mask = if STRINDEX(trajectory, "2010") gt 0 then 1
yes? list mask
             VARIABLE : IF STRINDEX(TRAJECTORY, "2010") GT 0 THEN 1
             DATA SET : SOCAT v3 data collection
             FILENAME : dsg_file.nc
             FILEPATH : /home/users/files/
             SUBSET   : 16 points (E)
 1   /  1:    ....
 2   /  2:    ....
 3   /  3:    ....
 4   /  4:   1.000
 5   /  5:   1.000
 6   /  6:   1.000
 7   /  7:   1.000
 8   /  8:   1.000
 9   /  9:    ....
 10  / 10:    ....
 11  / 11:    ....
 12  / 12:    ....
 13  / 13:    ....
 14  / 14:    ....
 15  / 15:    ....
 16  / 16:    ....
Now, choose just those trajectories. The masked variables are 2-D variables, but it is fine to send them as arguments to the PLOT/VS command.
yes? let mask2d = EXPNDI_BY_M_COUNTS(mask, rowsize, longest)
yes? let/units=degrees_east masked_lon = mask2d*lon2d
yes? let/units=degrees_north masked_lat = mask2d*lat2d
yes? let/units="`temp,return=units`"/title="`temp,return=title`" masked_temp = mask2d*var2d
yes? plot/vs/ribbon/nolab masked_lon, masked_lat, masked_temp
Profile datasets list the longitudes and latitudes as "metadata" variables, that is, one longitude/latitude value per profile.
yes? use my_profile_data.nc
yes? show data
     currently SET data sets:
    1> ./my_profile_data.nc  (default)
 name           title                              I         J         K         L         M        N
 PLATFORM_CODE  PLATFORM CODE                     ...       ...       ...       ...      1:74      ...
 LONGITUDE      Longitude                         ...       ...       ...       ...      1:74      ...
 LATITUDE       Latitude                          ...       ...       ...       ...      1:74      ...
 ROWSIZE        Number of Observations for this   ...       ...       ...       ...      1:74      ...
 TIME           OBSERVATION DATE                 1:43935    ...       ...       ...       ...      ...
 DEPTH          OBSERVATION DEPTH                1:43935    ...       ...       ...       ...      ...
 ZSAL           Sea Water Salinity               1:43935    ...       ...       ...       ...      ...
To plot, say, lines showing the salinity at depth as a function of the longitudes represented in the data, we need to replicate the longitude values, so that our new longitude variable has the longitude of profile 1 corresponding to each observation of profile 1, the longitude of profile 2 corresponding to each observation of that profile, and so on. Use the EXPND_BY_LEN function here.
yes? let nx = `rowsize[m=@sum]`  ! or equivalently, we could use the size of zsal.
yes? let lon_obs = EXPND_BY_LEN(longitude, rowsize, nx)
yes? let/units=degrees_east/title=longitude separate_lon = SEPARATE(lon_obs, rowsize, 1)
yes? let/units=`depth,return=units`/title=depth separate_dep = SEPARATE(depth, rowsize, 0)
yes? let/units="`zsal,return=units`"/title="`zsal,return=title`" separate_sal = SEPARATE(zsal, rowsize, 0)
yes? ! The /vlimits qualifier causes Ferret to draw the z axis as a depth axis on the plot:
yes? ribbon/vs/line/thick/key/lev=v/VLIMITS=2000:0 separate_lon, separate_dep, separate_sal