SUMS Archive and Retention Policies

SUMS manages the JSOC data according to directives provided by DRMS (or by another user system, such as the capture system). The directives that SUMS uses to allocate storage, tapes, and online retention time are:

Presumably the Permanent bit implies an infinite retention time. Jim: we need to describe the relationship between the flag bits and the retention time.

Within the JSOC some data series will be archived to tape and kept online for only 30-60 days, some will be kept online forever (or as close to it as we care) but will not be archived to tape, and some data series will be kept online for only a few days or weeks and not archived. Some examples are:

Archive  Retention (days)  Example
0        5                 Data series generated by a user developing new analysis tools
0        30                Export data may be kept for a while online to allow a remote user to access it
0        perm              HMI Level-1 data will be online but not archived
1        60                AIA or HMI level-0 data will be archived to tape but online only a while
1        5                 Capture system raw telemetry data should move to level-0 and tape quickly
1        varies            DRMS Session Logs are kept online for as long as the longest record made in that session
1        7                 DRMS Session logs for sessions where a module fails to complete and the session is aborted


The Interaction between DRMS and SUMS

When a user creates a series, the .jsd specifies whether that series' storage units should be archived (saved to tape). It also specifies a retention time (the number of days that storage units remain on disk). Every storage unit of the series that is allocated during any DRMS session is subject to these parameters. When the series is created, a record is added to the <ns>.drms_series table. That record contains several keywords, including 'archive' and 'retention', which hold the corresponding integer values specified in the .jsd.

Normally, when a DRMS module runs, it loads the values of the archive and retention keywords from the drms_series table. Right before terminating, the module commits any storage units that were created during the DRMS session. It is at this time that DRMS communicates the values of the archive and retention parameters to SUMS. SUMS then (saves this information where? does what?).

However, the archive and retention values in the drms_series table can be overridden. If the caller of the module specifies the "-A" flag on the command line, new storage units will be archived regardless of the value of the archive keyword in the drms_series table. Similarly, if the caller specifies DRMS_RETENTION=<numdays> on the command line, new storage units will remain on disk for <numdays> days regardless of the retention value in the drms_series table. These command-line values go out of scope when the DRMS session ends, after which the values in the drms_series table are used again.
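The override rule above can be summarized in a few lines of Python. This is only a sketch of the precedence described in the text, not DRMS code: SeriesPolicy, effective_policy, archive_flag and retention_override are hypothetical names standing in for the drms_series values, the "-A" flag and DRMS_RETENTION=<numdays>.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SeriesPolicy:
    """Hypothetical stand-in for the 'archive' and 'retention' values in drms_series."""
    archive: int    # 0 = do not archive, 1 = archive to tape
    retention: int  # days that storage units remain on disk


def effective_policy(series: SeriesPolicy,
                     archive_flag: bool = False,
                     retention_override: Optional[int] = None) -> SeriesPolicy:
    """Return the values a session would hand to SUMS at commit time.

    archive_flag models the "-A" command-line flag and retention_override
    models DRMS_RETENTION=<numdays>. The series defaults are never modified,
    so the next session without overrides sees them unchanged.
    """
    archive = 1 if archive_flag else series.archive
    retention = series.retention if retention_override is None else retention_override
    return SeriesPolicy(archive=archive, retention=retention)


# Example: a series declared with archive 0 and retention 5 in its .jsd.
defaults = SeriesPolicy(archive=0, retention=5)
this_session = effective_policy(defaults, archive_flag=True, retention_override=30)
next_session = effective_policy(defaults)
```

Here this_session carries the overridden values, while next_session falls back to the series defaults, which matches the statement that the command-line values go out of scope when the session ends.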


Notes from Jim

/home/jim/cvs/jsoc/src/base/sum/doc/sum_storage_options.txt

Here are the data storage types that I understand DRMS handles.


Transient

Data records and data segments exist only during the active DRMS session. DRMS deletes (i.e. does not commit?) all data records for the session. DRMS does not call sum_put() on any allocated storage units; it simply calls sum_close() to discard the allocated storage.


Temporary

Data records and data segments exist for at least a given duration. After this period they are subject to deletion, depending on the free data storage settings. The storage unit that holds the log files for the session gets the longest retention duration of any storage unit allocated in that session. The retention duration of a storage unit is determined by the sum_put() call parameters, where mode = TEMP+TOUCH and tdays = the number of 24-hour periods to retain, starting now. The default for tdays is 0. Setting tdays = -1 in effect makes the storage unit permanent on disk without archiving it to tape.
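As an illustration of the tdays rules above, here is a small Python sketch. It is not SUMS code, and the function names are hypothetical. Two points are assumptions that the notes do not state: a session containing a tdays = -1 unit is assumed to get a permanent log unit, and a session with no units is assumed to fall back to the default of 0.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional

PERMANENT = -1  # per the notes, tdays = -1 in effect keeps a unit on disk indefinitely


def expiry_time(tdays: int, now: datetime) -> Optional[datetime]:
    """Earliest time at which a unit becomes subject to deletion (None = never).

    tdays counts 24-hour periods starting at 'now'.
    """
    if tdays == PERMANENT:
        return None
    return now + timedelta(days=tdays)


def log_unit_tdays(unit_tdays: Iterable[int]) -> int:
    """Retention for the session-log unit: the longest retention in the session.

    Assumptions (not from the notes): a permanent unit makes the log unit
    permanent, and an empty session uses the default tdays of 0.
    """
    values = list(unit_tdays)
    if not values:
        return 0
    if PERMANENT in values:
        return PERMANENT
    return max(values)


# Example: a session that allocated units with 5, 30 and 7 days of retention.
session_log_tdays = log_unit_tdays([5, 30, 7])
session_log_expiry = expiry_time(session_log_tdays, datetime(2013, 5, 1))
```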

When SUMS needs storage space, expired storage units will be deleted. The log file storage unit contains a file (TBD - get name), created by DRMS, that lists the record numbers of the DRMS records that must then be deleted as well.
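The reclaim step can be pictured roughly as follows. This Python sketch is an illustration only: StorageUnit, its fields and select_for_deletion are hypothetical, and the choice to delete the longest-expired units first is an assumption, since the notes do not say in what order SUMS picks expired units. It also leaves out the tape-archive check that applies to archivable units.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class StorageUnit:
    unit_id: int                 # hypothetical identifier
    size_bytes: int
    expires: Optional[datetime]  # None = never subject to deletion


def select_for_deletion(units: List[StorageUnit],
                        bytes_needed: int,
                        now: datetime) -> List[StorageUnit]:
    """Pick expired units until at least bytes_needed would be freed.

    Units that have not expired are never selected, so the result may free
    less than bytes_needed. Ordering by expiry time is an assumption.
    """
    expired = [u for u in units if u.expires is not None and u.expires <= now]
    expired.sort(key=lambda u: u.expires)
    chosen: List[StorageUnit] = []
    freed = 0
    for unit in expired:
        if freed >= bytes_needed:
            break
        chosen.append(unit)
        freed += unit.size_bytes
    return chosen
```

For each unit selected this way, the record list kept in the corresponding session-log unit would identify the DRMS records to remove along with it.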


Archivable

Data segments are written to tape and then marked deletable; they are subject to deletion once their retention time has expired. sum_put() is called on the storage units with mode = ARCH+TOUCH and tdays = retention days. The data records created by the session are permanent in DRMS.


Permanent

This is the same as Archivable, except that the retention time is ignored and the storage units will never be removed from disk. This is done by calling sum_put() with mode = PERM.
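The four storage types can be lined up against the sum_put() parameters described in these notes. The Python sketch below is a summary aid, not the SUMS API: StorageType and sum_put_arguments are hypothetical names, and the mode values are plain strings copied from the notes rather than the real constants.

```python
from enum import Enum
from typing import Optional, Tuple


class StorageType(Enum):
    TRANSIENT = "transient"
    TEMPORARY = "temporary"
    ARCHIVABLE = "archivable"
    PERMANENT = "permanent"


def sum_put_arguments(kind: StorageType,
                      retention_days: int = 0) -> Optional[Tuple[str, Optional[int]]]:
    """Return (mode, tdays) as described in the notes.

    Returns None for transient storage, where sum_put() is not called and
    the allocated storage is discarded with sum_close() instead.
    """
    if kind is StorageType.TRANSIENT:
        return None
    if kind is StorageType.TEMPORARY:
        return ("TEMP+TOUCH", retention_days)
    if kind is StorageType.ARCHIVABLE:
        return ("ARCH+TOUCH", retention_days)
    # Permanent: the retention time is ignored, so no tdays value is reported.
    return ("PERM", None)


# Example: an archivable series with 60 days of online retention.
mode, tdays = sum_put_arguments(StorageType.ARCHIVABLE, retention_days=60)
```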

JsocWiki: SumsArchiveTimes (last edited 2013-05-01 04:35:23 by localhost)