NetDRMS - a shared data management system

Introduction

Background — About DRMS

To process, archive, and distribute the substantial quantity of data flowing from the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI) instruments on the Solar Dynamics Observatory (SDO), the Joint Science Operations Center (JSOC) has developed its own data management system. This system, the Data Record Management System (DRMS), organizes the data archive into a collection of data series, each consisting of a number of similarly organized data records. Each data record consists of a set of keyword-value metadata and (optionally) one or more segments of array data. All records in a given data series are described with the same set of metadata keywords. The layout of each segment array is identical from record to record within a given data series. When each record contains multiple segments, however, the segments need not share the same layout as one another.

The DRMS uses a relational database for storing the metadata associated with each data record, as well as the series themselves. A data series corresponds to a table in the database: each record in the series corresponds to a row in the table, each metadata keyword for the series to a column.
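The series-to-table mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DRMS schema: the series name `demo_series`, the keyword names, the `recnum` column, and both helper functions are hypothetical, and the in-memory SQLite database is a stand-in for whatever relational database a site actually runs.

```python
import sqlite3

# Hypothetical series and keyword names, used only for illustration.
SERIES = "demo_series"
KEYWORDS = {"t_obs": "TEXT", "exptime": "REAL", "quality": "INTEGER"}


def create_series(conn, series, keywords):
    """Create one table per data series, with one column per keyword."""
    columns = ", ".join(f"{name} {sqltype}" for name, sqltype in keywords.items())
    conn.execute(f"CREATE TABLE {series} (recnum INTEGER PRIMARY KEY, {columns})")


def add_record(conn, series, values):
    """Insert one data record as one row and return its record number."""
    names = ", ".join(values)
    marks = ", ".join("?" for _ in values)
    cursor = conn.execute(
        f"INSERT INTO {series} ({names}) VALUES ({marks})", tuple(values.values())
    )
    return cursor.lastrowid


# SQLite in memory is an assumption made so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
create_series(conn, SERIES, KEYWORDS)
first = add_record(conn, SERIES, {"t_obs": "2010-09-02T00:00:00", "exptime": 2.9, "quality": 0})
second = add_record(conn, SERIES, {"t_obs": "2010-09-02T00:00:12", "exptime": 2.9, "quality": 1})

# Selecting records by keyword value is an ordinary query on the series table.
good = conn.execute(
    f"SELECT recnum, t_obs FROM {SERIES} WHERE quality = 0"
).fetchall()
print(good)
```

Because every record in a series shares the same keywords, a single table definition covers the whole series, and selecting records by keyword value reduces to a row filter.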

The data segments are stored in a virtual file system based on a combination of cache disk space and archival tape, with unique identifiers forming part of the record metadata in DRMS. A data segment is therefore accessed through its record identifier in the DRMS, rather than directly by a fixed directory location. Its tape location is unique, but its disk directory path is arbitrary, hence the virtual nature of the data system. The tape and disk cache are managed by the Storage Unit Management System (SUMS). A DRMS server must be able to communicate with a SUMS server in order to retrieve or store data segments. The SUMS server need not be local, however, and multiple DRMS servers can communicate with a single SUMS server.
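The indirection described above, in which a record carries an identifier and the storage manager decides where the data currently sits on disk, might be sketched as follows. This is a toy model under stated assumptions, not the SUMS interface: the class `StorageUnitManager`, its methods, the `/cache/slotN` paths, and the tape labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StorageUnitManager:
    """Toy stand-in for a storage unit manager (not the real SUMS API)."""

    cache_root: str = "/cache"
    tape_index: dict = field(default_factory=dict)  # unit id -> fixed tape location
    disk_cache: dict = field(default_factory=dict)  # unit id -> current disk path
    next_slot: int = 0

    def archive(self, unit_id, tape_location):
        """Record the permanent tape location of a storage unit."""
        self.tape_index[unit_id] = tape_location

    def resolve(self, unit_id):
        """Return a disk path for the unit, staging it from tape if needed."""
        if unit_id not in self.disk_cache:
            if unit_id not in self.tape_index:
                raise KeyError(f"unknown storage unit {unit_id}")
            # Staging assigns whatever cache slot is free, so the path is arbitrary.
            self.disk_cache[unit_id] = f"{self.cache_root}/slot{self.next_slot}"
            self.next_slot += 1
        return self.disk_cache[unit_id]

    def evict(self, unit_id):
        """Drop the unit from the disk cache; the tape copy is unaffected."""
        self.disk_cache.pop(unit_id, None)


# Hypothetical record metadata: record number -> storage unit identifier.
records = {1: 1001}

manager = StorageUnitManager()
manager.archive(1001, ("TAPE01", 7))

path_a = manager.resolve(records[1])  # staged to some cache slot
manager.evict(records[1])             # cache space reclaimed
path_b = manager.resolve(records[1])  # staged again, possibly elsewhere
print(path_a, path_b, manager.tape_index[1001])
```

In this sketch the record metadata never stores a directory path, only the identifier, so the cache can move or drop data without invalidating any record.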

In future it may be possible for a DRMS server to run without direct communication to any SUMS server at all, retrieving (but not storing) data by network communication with other DRMS servers. This is not yet implemented, however.

For more information see the wiki pages at jsoc.stanford.edu.

DRMS is currently running in at least ten locations, and public site code identifiers have been reserved for several additional sites that have expressed interest in installing current or future versions. A list of all such sites can be found on the site information page.

Installing NetDRMS

There are two parts to setting up NetDRMS. First, the necessary services must be set up at the institution or group that will be hosting the NetDRMS service. The basic preparation and installation need to be done only once, although the actual software distribution may be updated from time to time without affecting the setup. Second, individual users may wish to set up the NetDRMS software distribution for use or development in their own environment. Again, there are a few administrative tasks that need to be performed once when a user is registered, but the software may be updated or rebuilt at any time. Because user setup is a simple task once the site preparation and setup are complete, the two parts have separate sets of instructions. Most users need to concern themselves only with the second, Installing / Upgrading NetDRMS.

  1. Add users as appropriate to the server database, following the User Setup instructions.
  2. Run sample test modules.
  3. For additional instructions on operations, maintenance, replication, and updating of databases, see the JSOC wiki instructions for now. There is also a gradual “cookbook” tutorial, with self-contained make instructions for would-be DRMS module writers, in the DRMS release under proj/cookbook.

2 Sep 2010, 17:08-0700