Overview of JSOC Data Series, DRMS, and SUMS

The JSOC data series

Data in the HMI/AIA JSOC is stored in "Data Series." A Data Series (or dataseries) is a sequence of like data objects, typically images or other binary data together with their associated meta-data. A dataseries consists of a sequence of Data Records (datarecords). Each datarecord holds the data for one step in the sequence. Most dataseries are sequences in time, so a step is usually a time step, but this is not required: in principle a dataseries can be any list of like data objects.

A datarecord consists of keyword-tagged meta-data describing the record and zero or more named Datasegments, which usually contain binary arrays of data values. All datarecords in a given dataseries have the same set of keyword and datasegment names; the values associated with those names are specific to each record. The dataseries description and the datarecords are maintained in a relational database called DRMS (Data Record Management System). DRMS is implemented as a set of [http://www.postgresql.org/ PostgreSQL] tables.
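The relationship between a dataseries and its datarecords can be sketched in a few lines of Python. This is only an illustration of the structure described above, not the DRMS schema: the class names `SeriesDefinition` and `DataRecord`, the series name `example.series`, and the keyword names are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union

KeywordValue = Union[int, float, str]


@dataclass(frozen=True)
class SeriesDefinition:
    """Hypothetical stand-in for a dataseries description."""
    name: str
    keyword_names: Tuple[str, ...]
    segment_names: Tuple[str, ...] = ()  # zero datasegments is allowed


@dataclass
class DataRecord:
    """One datarecord: keyword values plus named datasegments."""
    series: SeriesDefinition
    keywords: Dict[str, KeywordValue]
    # Segment name -> file name; the data itself lives outside the database.
    segments: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every record in a series carries exactly the series' keyword names...
        if set(self.keywords) != set(self.series.keyword_names):
            raise ValueError("keyword names do not match the series definition")
        # ...and exactly the series' datasegment names.
        if set(self.segments) != set(self.series.segment_names):
            raise ValueError("segment names do not match the series definition")


series = SeriesDefinition(
    name="example.series",               # made-up series name
    keyword_names=("T_OBS", "QUALITY"),  # made-up keyword names
    segment_names=("image",),
)
record = DataRecord(
    series=series,
    keywords={"T_OBS": "2010.05.01_00:00:00_TAI", "QUALITY": 0},
    segments={"image": "image.fits"},
)
print(sorted(record.keywords))
```

The check in `__post_init__` is the point of the sketch: the names are fixed by the series, while the values belong to each record.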

While the DRMS record contains the description of each datasegment, the data contained in a datasegment is not stored in the database. It is stored in Storage Units "owned" by SUMS (Storage Unit Management System). Storage units are simply directories. SUMS maintains its own tables in PostgreSQL to track the location of each storage unit on disk and/or tape. A storage unit may contain one or more datasegments for one or more datarecords.
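As a rough sketch of this division of labour, the following Python models the SUMS location tables as a plain dictionary from a storage unit identifier to a directory. The class `StorageUnitIndex`, its methods, the identifier `42`, and the paths are hypothetical; the real SUMS tables and directory layout are not described here.

```python
from pathlib import PurePosixPath
from typing import Dict


class StorageUnitIndex:
    """Hypothetical stand-in for the SUMS tables that track storage units."""

    def __init__(self) -> None:
        self._directories: Dict[int, PurePosixPath] = {}

    def register(self, unit_id: int, directory: str) -> None:
        """Record where a storage unit (a directory) currently lives on disk."""
        self._directories[unit_id] = PurePosixPath(directory)

    def segment_path(self, unit_id: int, relative_name: str) -> PurePosixPath:
        """Resolve a datasegment file inside a storage unit."""
        try:
            return self._directories[unit_id] / relative_name
        except KeyError:
            raise LookupError(f"no disk location known for storage unit {unit_id}") from None


index = StorageUnitIndex()
index.register(42, "/sums/unit42")  # made-up identifier and path

# One storage unit can hold datasegments for more than one datarecord.
print(index.segment_path(42, "record1/image.fits"))
print(index.segment_path(42, "record2/image.fits"))
```

The database side only needs to remember which storage unit a datasegment belongs to; turning that into a path is left to the storage layer.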


Usually one or more keywords are designated prime keys. Together, the prime keys must uniquely identify a record, and they are used to define the main index for the series. Records with the same set of prime key values are assumed to be different versions of the same record. Thus the current version of any record in a given series can be found by specifying the values of the prime keys for that series. All series have one pre-defined keyword called "recnum", which has a unique value for each record and is used as the main index when no prime keys are defined.
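The prime key rule can be illustrated with a small Python sketch that groups records by their prime key values and keeps one version per group. It assumes that the version with the highest recnum is the current one, which the text above does not state. The function name `current_versions` and the keyword names `T_REC` and `CAMERA` are made up for the example.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def current_versions(records: Iterable[Record], prime_keys: Tuple[str, ...]) -> List[Record]:
    """Keep one record per distinct set of prime key values.

    Assumption for this sketch: the highest recnum is the current version.
    """
    latest: Dict[tuple, Record] = {}
    for rec in records:
        # With no prime keys, recnum itself is the index, so every record is distinct.
        index = tuple(rec[k] for k in prime_keys) if prime_keys else (rec["recnum"],)
        kept = latest.get(index)
        if kept is None or rec["recnum"] > kept["recnum"]:
            latest[index] = rec
    return sorted(latest.values(), key=lambda r: r["recnum"])


records = [
    {"recnum": 1, "T_REC": "t0", "CAMERA": 1, "QUALITY": 5},
    {"recnum": 2, "T_REC": "t1", "CAMERA": 1, "QUALITY": 0},
    {"recnum": 3, "T_REC": "t0", "CAMERA": 1, "QUALITY": 0},  # newer version of recnum 1
]

print([r["recnum"] for r in current_versions(records, ("T_REC", "CAMERA"))])
print([r["recnum"] for r in current_versions(records, ())])
```

With `T_REC` and `CAMERA` as prime keys, records 1 and 3 are versions of the same record and only one survives. With no prime keys, all three records are distinct.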

To access a set of records from a series, a description that selects the desired records must be provided. We call that description a "Dataset Name". Thus, in JSOC/DRMS a dataset name is actually a database query. The DRMS dataset name rules are intended to provide user-friendly names that are easy to remember and use.
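To make the "a dataset name is a query" idea concrete, here is a deliberately simplified Python sketch. It handles only a series name followed by bracketed values, matched in order against the prime keys, and turns that into a parameterised SQL string. The function `dataset_name_to_query`, the series name, the keyword names, and the generated SQL are all assumptions for illustration; the actual DRMS naming rules are richer, and this is not the DRMS parser.

```python
import re
from typing import List, Sequence, Tuple

# Simplified grammar: series_name followed by zero or more [value] groups.
_NAME_RE = re.compile(r"^(?P<series>[A-Za-z_][\w.]*)(?P<filters>(\[[^\[\]]*\])*)$")


def dataset_name_to_query(name: str, prime_keys: Sequence[str]) -> Tuple[str, List[str]]:
    """Translate a simplified dataset name into (sql, parameters)."""
    match = _NAME_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised dataset name: {name!r}")

    values = re.findall(r"\[([^\[\]]*)\]", match.group("filters"))
    if len(values) > len(prime_keys):
        raise ValueError("more bracketed values than prime keys")

    clauses: List[str] = []
    params: List[str] = []
    for key, value in zip(prime_keys, values):
        if value:  # in this sketch an empty [] leaves that prime key unconstrained
            clauses.append(f"{key} = ?")
            params.append(value)

    sql = f"SELECT * FROM {match.group('series')}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = dataset_name_to_query(
    "example.series[2010.05.01][2]",  # made-up series and values
    prime_keys=("T_REC", "CAMERA"),   # made-up prime keys
)
print(sql)
print(params)
```

Values are passed as parameters rather than pasted into the SQL text, so a dataset name cannot inject arbitrary SQL through a bracketed value in this sketch.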

DRMS

SUMS

[wiki:SumsDataModel SUMS - the Storage Unit Management System]

Implementation

Older Documents

There are several older documents that, while not accurate in describing the JSOC system as it is now implemented, do contain useful information about the design, intent, and usage ideas. These are: