Configuring THREDDS Data Server

Services

TDS offers a number of services that you can enable, including OPeNDAP, WCS, WMS, NetCDF Subset and more. At a minimum, you should enable the OPeNDAP service. The TDS catalog can also advertise other protocols through which a file can be retrieved, such as GridFTP. From version 4.0 onwards, configuration files are located in $JETTY_HOME/content/thredds. This directory is referred to as $TDS_CONFIG_DIR from now on.

There are two main files you need to modify: threddsConfig.xml (the server config) and catalog.xml (the dataset catalog config).

Configuring TDS

TDS has a few services that are turned off by default. You can enable them by modifying the $TDS_CONFIG_DIR/threddsConfig.xml file. For each service you want, make sure that its <allow> element is set to true (it is false by default). E.g.:

  <!--
  The Netcdf Subset Service is off by default.
  -->
  <NetcdfSubsetService>
    <allow>true</allow>
    <dir>/tmp/thredds/ncSubsetCache/</dir>
    <scour>10 min</scour>
    <maxAge>-1 min</maxAge>
  </NetcdfSubsetService>

  <!--
  The WCS Service is off by default.
  Also, off by default (and encouraged) is operating on a remote dataset.
  -->
  <WCS>
    <allow>true</allow>
    <allowRemote>true</allowRemote>
    <dir>/tmp/thredds/wcsCache/</dir>
    <scour>15 min</scour>
    <maxAge>30 min</maxAge>
  </WCS>

  <WMS>
    <allow>true</allow>
  </WMS>
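
After editing, the allow flags can be read back with a short script as a sanity check. The following is a minimal Python sketch and not part of TDS. It assumes the root element of the file is threddsConfig and uses an inline sample in place of the real file; the function name enabled_services is hypothetical.

```python
import xml.etree.ElementTree as ET

# Inline sample standing in for $TDS_CONFIG_DIR/threddsConfig.xml.
# The <threddsConfig> root element is an assumption about the file layout.
SAMPLE_CONFIG = """
<threddsConfig>
  <NetcdfSubsetService><allow>true</allow></NetcdfSubsetService>
  <WCS><allow>true</allow><allowRemote>true</allowRemote></WCS>
  <WMS><allow>false</allow></WMS>
</threddsConfig>
"""


def enabled_services(config_xml, names=("NetcdfSubsetService", "WCS", "WMS")):
    """Return a dict mapping each service element name to True or False."""
    root = ET.fromstring(config_xml)
    result = {}
    for name in names:
        # A missing <allow> element is treated as "false", the stated default.
        allow = root.findtext(f"{name}/allow", default="false")
        result[name] = allow.strip().lower() == "true"
    return result


if __name__ == "__main__":
    for service, enabled in enabled_services(SAMPLE_CONFIG).items():
        print(f"{service}: {'enabled' if enabled else 'disabled'}")
```

To check a real installation, read the contents of threddsConfig.xml from disk and pass them to the function instead of the sample string.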

TDS uses caches to store data temporarily for the services enabled in the above config. It's a good idea to enable them, but make sure you specify a location that has enough free space!

You can also modify the HTML view of the catalogs, such as the logo on the top left of the catalog views, by editing the <htmlSetup> element. E.g.:

  <htmlSetup>
    <!--
     * The URL and alternate text for this TDS installation's logo.
     * -->
    <installName>OPeNDAP</installName>
    <installLogoUrl>arcs_logo.png</installLogoUrl>
    <installLogoAlt></installLogoAlt>

    <!--
     * Information for the institution hosting this TDS installation:
     * - a URL for the institution;
     * - a URL to the institution logo; and
     * - alternate text for the institution logo.
     * -->
    <hostInstName>ARCS</hostInstName>
    <hostInstUrl>http://www.arcs.org.au</hostInstUrl>
    <hostInstLogoUrl></hostInstLogoUrl>
    <hostInstLogoAlt></hostInstLogoAlt>
  </htmlSetup>

Note that the image URL is relative to the TDS webapp directory, i.e.: /opt/jetty-6.1.15/webapps/thredds/

Enabling cache

TDS uses a few caches that enable faster access to files. Please make sure that the cache directories exist and are writable by the jetty user! Since each system is different, you should consult your system administrator for an ideal location. The following settings are recommended:

  <!--
  CDM uses the DiskCache directory to store temporary files, like uncompressed files.
  -->
  <DiskCache>
    <alwaysUse>true</alwaysUse>
    <dir>/data/tmp/thredds/cache/</dir>
    <scour>1 hour</scour>
    <maxSize>10 Gb</maxSize>
  </DiskCache>

  <!--
  Caching open NetcdfFile objects.
  default is to allow 200 - 400 open files, cleanup every 10 minutes
  -->
  <NetcdfFileCache>
    <minFiles>200</minFiles>
    <maxFiles>400</maxFiles>
    <scour>10 min</scour>
  </NetcdfFileCache>

  <!--
  Caching open NetcdfDataset objects.
   default allow 100 - 200 open datasets, cleanup every 10 minutes
  -->
  <NetcdfDatasetCache>
    <minFiles>100</minFiles>
    <maxFiles>200</maxFiles>
    <scour>10 min</scour>
  </NetcdfDatasetCache>
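
Before restarting the server, it can help to confirm that each cache directory exists, is writable, and has enough free space. The following is a minimal Python sketch, not part of TDS; the function name check_cache_dir and the 10 GiB threshold are illustrative assumptions. Run it as the jetty user, because the write check applies to whoever runs the script.

```python
import os
import shutil


def check_cache_dir(path, min_free_bytes=0):
    """Return a list of problems found for one cache directory (empty if none)."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path}: directory does not exist")
        return problems
    # Creating files in a directory needs both write and execute permission.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path}: not writable by the current user")
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"{path}: only {free} bytes free, wanted {min_free_bytes}")
    return problems


if __name__ == "__main__":
    # Directories taken from the examples on this page; adjust to your system.
    cache_dirs = (
        "/data/tmp/thredds/cache/",
        "/tmp/thredds/ncSubsetCache/",
        "/tmp/thredds/wcsCache/",
    )
    for cache_dir in cache_dirs:
        for problem in check_cache_dir(cache_dir, min_free_bytes=10 * 1024**3):
            print(problem)
```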

Configuring Services

To define services, you will need to edit the configuration catalog file, $TDS_CONFIG_DIR/catalog.xml. If you are serving gridded datasets, it is recommended that you use the following services:

  <service name="allServices" base="" serviceType="compound">
    <service name="dapService" serviceType="OpenDAP" base="/thredds/dodsC/" />
    <service name="httpService" serviceType="HTTPServer" base="/thredds/fileServer/" /> <!-- direct file download --> 
    <service name="wcsService" serviceType="WCS" base="/thredds/wcs/" /> <!-- OGC Web Coverage Service -->
    <service name="wmsService" serviceType="WMS" base="/thredds/wms/" /> <!-- OGC Web Map Service -->
    <service name="ncss" serviceType="NetcdfSubset (Experimental)" base="/thredds/ncss/grid/" /> <!-- NetCDF Subset service -->
    <service name="GridFTP" serviceType="GridFTP" base="gsiftp://gridftp-whiteout.tpac.org.au/library/" />
  </service>

For services, the resulting URL is concatenated like so:

service.base + access.urlPath + service.suffix

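Where access.urlPath is the dataset's URL path (the urlPath attribute in the dataset definition) and service.suffix is the optional suffix attribute of the service, which is empty for the services above. So using the service settings above, there will be 6 URLs for each dataset, e.g.: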

   1. OPeNDAP: /thredds/dodsC/testAll/2004050400_eta_211.nc
   2. HTTPServer: /thredds/fileServer/testAll/2004050400_eta_211.nc
   3. WCS: /thredds/wcs/testAll/2004050400_eta_211.nc
   4. WMS: /thredds/wms/testAll/2004050400_eta_211.nc
   5. NetcdfSubset: /thredds/ncss/grid/testAll/2004050400_eta_211.nc
   6. GridFTP: gsiftp://gridftp-whiteout.tpac.org.au/library/testAll/2004050400_eta_211.nc
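
The concatenation rule can be written out as a few lines of Python. This is only an illustrative sketch: the helper name access_url is hypothetical, and the service bases are copied from the allServices example above.

```python
def access_url(service_base, url_path, service_suffix=""):
    """Concatenate service.base + access.urlPath + service.suffix."""
    return f"{service_base}{url_path}{service_suffix}"


# Service bases copied from the allServices example on this page.
SERVICE_BASES = {
    "OpenDAP": "/thredds/dodsC/",
    "HTTPServer": "/thredds/fileServer/",
    "WCS": "/thredds/wcs/",
    "WMS": "/thredds/wms/",
    "NetcdfSubset": "/thredds/ncss/grid/",
    "GridFTP": "gsiftp://gridftp-whiteout.tpac.org.au/library/",
}

if __name__ == "__main__":
    for service_type, base in SERVICE_BASES.items():
        print(f"{service_type}: {access_url(base, 'testAll/2004050400_eta_211.nc')}")
```

Note that the base already ends with a slash and the urlPath does not start with one; the rule is a plain string concatenation, so a missing or doubled slash in either part ends up in the URL.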

However, not all services work on all file types. WCS is limited to regularly gridded datasets, while WMS works on both regularly and irregularly gridded datasets. Here's a list of sensible configurations for each file type:

<?xml version="1.0" encoding="UTF-8"?>
<catalog name="TPAC/ARCS OPeNDAP server"
        xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
        xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Regularly Gridded Datasets -->
  <service name="regGriddedServices" base="" serviceType="compound">
    <service name="dapService" serviceType="OpenDAP" base="/thredds/dodsC/" />
    <service name="httpService" serviceType="HTTPServer" base="/thredds/fileServer/" /> <!-- direct file download -->
    <service name="wcsService" serviceType="WCS" base="/thredds/wcs/" /> <!-- OGC Web Coverage Service -->
    <service name="wmsService" serviceType="WMS" base="/thredds/wms/" /> <!-- OGC Web Map Service -->
    <service name="ncss" serviceType="NetcdfSubset (Experimental)" base="/thredds/ncss/grid/" /> <!-- NetCDF Subset service -->
    <service name="GridFTP" serviceType="GridFTP" base="gsiftp://arcs-df.tpac.org.au/library/" />
   </service>

   <!-- Irregularly Gridded Dataset -->
   <service name="irregGriddedServices" base="" serviceType="compound">
    <service name="dapService" serviceType="OpenDAP" base="/thredds/dodsC/" />
    <service name="httpService" serviceType="HTTPServer" base="/thredds/fileServer/" /> <!-- direct file download -->
    <service name="wmsService" serviceType="WMS" base="/thredds/wms/" /> <!-- OGC Web Map Service -->
    <service name="ncss" serviceType="NetcdfSubset (Experimental)" base="/thredds/ncss/grid/" /> <!-- NetCDF Subset service -->
    <service name="GridFTP" serviceType="GridFTP" base="gsiftp://gridftp-whiteout.tpac.org.au/library/" />
   </service>

   <!-- Station Datasets -->
   <service name="station" base="" serviceType="compound">
    <service name="dapService" serviceType="OpenDAP" base="/thredds/dodsC/" />
    <service name="httpService" serviceType="HTTPServer" base="/thredds/fileServer/" /> <!-- direct file download -->
    <service name="GridFTP" serviceType="GridFTP" base="gsiftp://arcs-df.tpac.org.au/library/" />
   </service>

   <!-- Trajectory Datasets -->
   <service name="trajectory" base="" serviceType="compound">
    <service name="dapService" serviceType="OpenDAP" base="/thredds/dodsC/" />
    <service name="httpService" serviceType="HTTPServer" base="/thredds/fileServer/" /> <!-- direct file download -->
    <service name="GridFTP" serviceType="GridFTP" base="gsiftp://arcs-df.tpac.org.au/library/" />
   </service>


    <!-- Files that are not served by OPeNDAP, just plain HTTP.  E.g. Matlab Scripts -->
    <service name="httpOnly" serviceType="HTTPServer" base="/thredds/fileServer/" />

Configuring Datasets

This section is based on http://www.unidata.ucar.edu/projects/THREDDS/tech/tutorial/TDSConfiguration.html

Datasets are also specified in the THREDDS configuration catalog file, $TDS_CONFIG_DIR/catalog.xml. The default catalog contains a few datasets.

Datasets are logical endpoints of data access. There are generally three ways to set up datasets:

  1. Directly pointing to individual files
  2. Scanning a directory for files
  3. Aggregating files together and making multiple files appear like a single dataset

Pointing to individual files

When listing individual files, you will have to specify a datasetRoot element, e.g.:

<datasetRoot path="test" location="content/testdata/" />

Here the datasetRoot's path attribute is the relative path that appears in dataset URLs, and the location attribute is the directory that holds the physical files. In this case, any dataset whose urlPath begins with test/ is associated with this datasetRoot (for example, /thredds/dodsC/test/testData.nc for the OPeNDAP service), and its file is looked up relative to the location attribute. In this particular instance, location points to $TDS_CONFIG_DIR/public/testdata (/opt/jetty-6.1.15.rc4/content/thredds/public/testdata).

To add a file to the catalog, you will need to specify the following:

<dataset name="Test Single Dataset" ID="testDataset" serviceName="allServices"
           urlPath="test/testData.nc" />

Where:

  • name: is the human readable name
  • ID: machine readable ID for this dataset
  • serviceName: the name attribute of a service element
  • urlPath: this is where you link the dataset back to the datasetRoot element. This is the access.urlPath part of a service URL.
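
The link between urlPath and datasetRoot can be sketched in Python as follows. This is a simplified model, not the TDS implementation: the function name resolve_dataset_file is hypothetical, and picking the longest matching root when several match is an assumption.

```python
import posixpath


def resolve_dataset_file(url_path, dataset_roots):
    """Map a dataset urlPath to a file path via the matching datasetRoot.

    dataset_roots maps each datasetRoot path attribute to its location
    attribute. Returns None when no datasetRoot matches.
    """
    best_prefix = None
    best_location = None
    for root_path, location in dataset_roots.items():
        prefix = root_path.rstrip("/") + "/"
        if not url_path.startswith(prefix):
            continue
        # Assumption: when several roots match, the longest one wins.
        if best_prefix is None or len(prefix) > len(best_prefix):
            best_prefix, best_location = prefix, location
    if best_prefix is None:
        return None
    remainder = url_path[len(best_prefix):]
    return posixpath.join(best_location, remainder)


if __name__ == "__main__":
    roots = {"test": "content/testdata/"}
    print(resolve_dataset_file("test/testData.nc", roots))
```

With the datasetRoot from this page, the urlPath test/testData.nc maps to content/testdata/testData.nc, while a urlPath that does not start with test/ has no matching root.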

You can also group multiple datasets together, by nesting them in a dataset element:

<dataset name="Test Group">
    <dataset name="myGroup">
        <dataset name="Test Single Dataset" ID="testDataset" serviceName="allServices"
           urlPath="test/testData.nc" />
        <dataset name="Test Second Single Dataset" ID="testDataset2" serviceName="allServices"
           urlPath="test/testData2.nc" />
    </dataset>
</dataset>
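
Since the ID is the machine readable identifier of a dataset, it is worth checking that no two datasets in a catalog share one, which is easy to get wrong when copying entries into a group. The following is a minimal Python sketch under that assumption; the function name duplicate_dataset_ids is hypothetical and the sample catalog is made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by the catalog element on this page.
CATALOG_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Made-up sample catalog with a deliberately repeated ID.
SAMPLE_CATALOG = f"""
<catalog xmlns="{CATALOG_NS}">
  <dataset name="Test Group">
    <dataset name="First" ID="testDataset" urlPath="test/testData.nc" />
    <dataset name="Second" ID="testDataset" urlPath="test/testData2.nc" />
    <dataset name="Third" ID="thirdDataset" urlPath="test/testData3.nc" />
  </dataset>
</catalog>
"""


def duplicate_dataset_ids(catalog_xml):
    """Return a sorted list of ID values used by more than one dataset element."""
    root = ET.fromstring(catalog_xml)
    ids = [
        element.get("ID")
        for element in root.iter(f"{{{CATALOG_NS}}}dataset")
        if element.get("ID")
    ]
    return sorted(value for value, count in Counter(ids).items() if count > 1)


if __name__ == "__main__":
    print(duplicate_dataset_ids(SAMPLE_CATALOG))
```

The sketch only looks at dataset elements; datasetScan elements would need to be included as well for a complete check.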

Pointing to directory of files

There is a shortcut for listing the files in a directory. The datasetScan element scans a directory for files that match its inclusion and exclusion filters. It also applies the same filters to any subdirectories.

For these datasets, you do not need to have a datasetRoot element. Instead, all you need to define is:

  <datasetScan name="CCAM_CFT_NCEP" ID="CCAM_CFP_NCEP" path="CCAM_CFT_NCEP" location="/usr/local/apache-tomcat/content/thredds/public/ccam_agg">
    <metadata inherited="true">
      <serviceName>allServices</serviceName>
      <authority>TPAC</authority>
      <dataType>Grid</dataType>
      <dataFormat>NetCDF</dataFormat>
    </metadata>
    <filter>
      <exclude wildcard="*.ncml"/>
      <include wildcard="*.nc" />
    </filter>
  </datasetScan>

Where:

  • metadata: refers to dataset metadata. Note that inherited is set to true, so any datasets within this dataset (and nested datasets) will also have the same metadata. This includes the set of services that this dataset will be served by.
  • filter: includes and excludes files based on a wildcard pattern
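
The effect of the include and exclude wildcards can be approximated in Python. This is a sketch only: it assumes that a file is kept when it matches at least one include pattern and no exclude pattern, and it uses shell-style matching from the standard fnmatch module as a stand-in for the TDS wildcard rules. The function name passes_filter is hypothetical.

```python
from fnmatch import fnmatchcase


def passes_filter(filename, includes=(), excludes=()):
    """Return True if filename would be kept by the include/exclude patterns."""
    # Any matching exclude pattern rejects the file outright.
    if any(fnmatchcase(filename, pattern) for pattern in excludes):
        return False
    # With include patterns present, at least one must match.
    if includes:
        return any(fnmatchcase(filename, pattern) for pattern in includes)
    # No include patterns: everything not excluded is kept (assumption).
    return True


if __name__ == "__main__":
    for name in ("ccam_2000.nc", "ccam_agg.ncml", "README.txt"):
        kept = passes_filter(name, includes=["*.nc"], excludes=["*.ncml"])
        print(f"{name}: {'kept' if kept else 'skipped'}")
```

With the patterns from the datasetScan example, only files ending in .nc are kept; NcML files and anything else in the directory are skipped.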

Changing default index page

By default, Jetty shows the test webapp as the default page of the server, i.e. the page you see when you access http://<host>. To change that, you will need to remove the test app from the contexts directory and create a new root document directory.

As the jetty user:

cd /opt/jetty-6.1.15/contexts/
mv test.xml test.xml.old
cd /opt/jetty-6.1.15/webapps/
mkdir root
touch root/index.html

Modify the index.html document so that it redirects the browser to the THREDDS default page, using something like the following:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>ARCS OPeNDAP @ QCIF</title>
<meta http-equiv="REFRESH" content="0;url=http://opendap-qcif.arcs.org.au/thredds/">
</head>
<body>
</body>
</html>