Differences between revisions 1 and 2
Revision 1 as of 2010-03-24 05:56:41
Size: 14202
Editor: arta-mbp
Comment:
Revision 2 as of 2010-03-24 05:58:42
Size: 14239
Editor: arta-mbp
Comment:
        Storage Unit Management Subsystem (SUMS)        
        ---------------------------------------

/* SUM.h */
#ifndef SUM_INCL
#define SUM_VERSION_NUM    (1.0)
#define DBCONNECT "DSOWNER/HMI4SDO@hmidb"
#include <jsoc.h>
#include <stdint.h>     /* need to repeat this for pro-c precompiler */

typedef uint64_t SUMID_t;

/* Bitmap of modes set in a SUM_t structure */
#define ARCH 1          /* archive the storage unit to tape */
#define TEMP 2          /* the storage unit is temporary */
#define PERM 4          /* the storage unit is permanent */
#define TOUCH 8         /* tdays gives the storage unit retention time */
#define RETRIEVE 16     /* retrieve from tape */
#define NORETRIEVE 32   /* don't retrieve from tape */
#define FULL 1024       /* also set this to get full info from DB query */

#define SUM_INCL
#endif
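
Because the mode values are single bits, they are combined with bitwise OR and tested with bitwise AND. The following is a minimal sketch of that usage, not SUMS code: the macro values are copied from SUM.h above, while mode_to_string() is a hypothetical helper invented here for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mode bits copied from SUM.h above. */
#define ARCH 1
#define TEMP 2
#define PERM 4
#define TOUCH 8
#define RETRIEVE 16
#define NORETRIEVE 32
#define FULL 1024

/* Hypothetical helper (not part of SUMS): write the names of the bits set
 * in mode into buf, separated by '|'. Output is cut short if buf is too
 * small. Returns buf. */
char *mode_to_string(int mode, char *buf, size_t len)
{
  static const struct { int bit; const char *name; } bits[] = {
    {ARCH, "ARCH"}, {TEMP, "TEMP"}, {PERM, "PERM"}, {TOUCH, "TOUCH"},
    {RETRIEVE, "RETRIEVE"}, {NORETRIEVE, "NORETRIEVE"}, {FULL, "FULL"}
  };
  size_t i, used = 0;

  assert(buf != NULL && len > 0);
  buf[0] = '\0';
  for (i = 0; i < sizeof bits / sizeof bits[0]; i++) {
    if (mode & bits[i].bit) {            /* test one bit with AND */
      int n = snprintf(buf + used, len - used, "%s%s",
                       used ? "|" : "", bits[i].name);
      if (n < 0 || (size_t)n >= len - used)
        break;                           /* buffer full, stop appending */
      used += (size_t)n;
    }
  }
  return buf;
}
```

A caller preparing a SUM_put() for an archived unit with an explicit retention time would set mode to ARCH | TOUCH; passing that value to the helper yields the text ARCH|TOUCH.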

-------------------------------------------------------------------
This is found in sum_rpc.h:

typedef struct SUM_struct
{
  SUMID_t uid;
  CLIENT *cl;            /* client handle for calling sum_svc */
  SUM_info_t *sinfo;     /* info from sum_main for SUM_info() call */
  int debugflg;          /* verbose debug mode if set */
  int mode;              /* bit map of various modes */
  int tdays;             /* touch days for retention */
  int group;             /* group # for the given dataseries */
  int storeset;          /* assign storage from JSOC, DSDS, etc. Default JSOC*/
  int status;            /* return status on calls. 1 = error, 0 = success */
  double bytes;
  char *dsname;          /* dataseries name */
  char *username;        /* user's login name */
  char *history_comment; /* history comment string */
  int reqcnt;            /* # of entries in arrays below */
  uint64_t *dsix_ptr;    /* ptr to array of dsindex uint64_t */
  char **wd;             /* ptr to array of char * */
} SUM_t;

typedef struct SUMEXP_struct
{
  SUMID_t uid;
  int reqcnt;           /* # of entries in arrays below */
  char *host;           /* hostname target of scp call */
  char **src;           /* ptr to char * of source dirs */
  char **dest;          /* ptr to char * of destination dirs */
} SUMEXP_t;


--------------------------------------------------------------------------
SUM_t *SUM_open(char *server, char *db, int (*history)(const char *fmt, ...))

        A DRMS instance opens a session with SUMS. The server argument
        names the server to connect to; if none is given, it defaults to
        the SUMSERVER environment variable, else to the SUMSERVER define.
        The db name has been deprecated and has no effect. The db will be
        the one that sum_svc was started with, e.g. sum_svc hmidb.
        The history argument is a printf-type logging function.
        Returns a pointer to a SUM handle that is
        used to identify this user for this session.
        Returns NULL on failure.
        Currently the dsix_ptr[] and wd[] arrays are malloc'd to size
        SUMARRAYSZ (64).

--------------------------------------------------------------------------
int SUM_close(SUM_t *sum, int (*history)(const char *fmt, ...))

        Closes the session. Returns 0 on success, else an error code.
        Releases all read-only storage and all uncommitted allocated
        storage, and frees any other resources for this SUM handle.

--------------------------------------------------------------------------
int SUM_get(SUM_t *sum, int (*history)(const char *fmt, ...))

        Gets the location of the storage units given by the dsindexes.
        Marks the storage units as open for read.
        Returns 0 on success with data available, 1 on error, or
        RESULT_PEND when the data will come from tape (call SUM_poll()
        or SUM_wait() to get the completion msg). NOTE: Caller must check
        sum->status for any errors after SUM_poll() or final SUM_wait().
        NOTE: You can request any number of storage units (reqcnt) in one
        call. One completion message will be received when all units are
        online. If you get back an error status, you will not know which
        storage unit failed. All the reqcnt storage units stand or fall
        as a team. If you want resolution at the individual storage unit
        level, you should make separate SUM_get() calls.
        If you make another SUM_get() call before you do a SUM_wait(),
        there will be two completion messages pending. SUM_wait() will
        return after the first one, you will not know which request
        completed, and the other one will still be pending. So keep the
        SUM_get()/SUM_wait() calls paired, unless you want to explicitly
        program for something more complex.

The caller sets:
SUMID   = the open id
mode    = RETRIEVE or NORETRIEVE, to get any offline dataunits from tape
          storage or not. Also TOUCH to change the online retention time.
tdays   = touch days for online retention. Always used, regardless of TOUCH
          mode, if the SU was read from tape.
reqcnt  = Number of dsindex values given below to get
dsix_ptr= Pointer to array of reqcnt uint64_t to indicate DB index of 
          the dataunits

The function returns:
Error code = 1, else 0 on success with:
  wd    = Array of char * pointing to the wd of each dsindex given. Value
          is empty string for any non-existing storage unit, or an offline
          storage unit when mode = NORETRIEVE.
else RESULT_PEND when result msg will be sent later.
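
The pairing rule described for SUM_get() and SUM_wait() can be sketched as follows. This is an illustration under stated assumptions, not the SUMS implementation: it does not link against SUMS, so demo_sum_t is a cut-down stand-in for SUM_t, the get and wait steps are passed in as function pointers, the fake_* functions only imitate sum_svc, and the numeric value given to RESULT_PEND is a placeholder (the real value comes from the SUMS headers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RETRIEVE 16       /* from SUM.h */
#define RESULT_PEND 32    /* ASSUMED placeholder value, not the real define */

/* Stand-in for SUM_t holding only the fields this sketch touches. */
typedef struct {
  int mode, reqcnt, status;
  uint64_t *dsix_ptr;
  char **wd;
} demo_sum_t;

typedef int (*get_fn)(demo_sum_t *);   /* plays the role of SUM_get()  */
typedef int (*wait_fn)(demo_sum_t *);  /* plays the role of SUM_wait() */

/* Fetch one storage unit and block until it is online.
 * Returns 0 with sum->wd[0] filled in, or 1 on any error. */
int fetch_one(demo_sum_t *sum, uint64_t sunum, get_fn get, wait_fn wait)
{
  int rc;

  assert(sum && sum->dsix_ptr && sum->wd);
  sum->mode = RETRIEVE;
  sum->reqcnt = 1;             /* one unit per call, so a failure is attributable */
  sum->dsix_ptr[0] = sunum;
  rc = get(sum);
  if (rc == RESULT_PEND) {
    /* Exactly one wait per pending get, so completions cannot be confused. */
    if (wait(sum) != 0)
      return 1;
    rc = sum->status;          /* must be checked after the completion msg */
  }
  return rc != 0;
}

/* Fakes that imitate a tape retrieval, for demonstration only. */
static char demo_wd[] = "/SUM2/D123456/";
int fake_get_pending(demo_sum_t *sum) { (void)sum; return RESULT_PEND; }
int fake_wait_ok(demo_sum_t *sum) { sum->wd[0] = demo_wd; sum->status = 0; return 0; }
int fake_wait_bad(demo_sum_t *sum) { sum->status = 1; return 0; }
```

Requesting one unit per call trades throughput for error resolution; a caller that batches several dsindex values in one request gets a single completion message and a single status for the whole batch.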

--------------------------------------------------------------------------
int SUM_alloc(SUM_t *sum, int (*history)(const char *fmt, ...))

        Assigns storage from /SUM and does mkdir and reports wd. The dir
        is owned by the calling user. This is used when an application wants 
        to make datarecords and put them in the managed /SUM storage. 
        The application makes a SUM_alloc() call for each storage unit that
        it wants to output datarecords to.
        Also used internally to allocate storage for dataunits being retrieved 
        from tape.
        (An application can make multiple SUM_alloc() calls, while previously
        there was only one alloc for any pe map file. Note that there is no
        longer any subdir naming template, as prog:, level:, and series:
        no longer exist. Also, the dsindex is now assigned at the start
        rather than at the end of storage unit (dataset) creation.)
        NOTE: Currently you are restricted to make only one alloc at a time,
        i.e. reqcnt must be 1.

The caller sets:
SUMID   = the open id
bytes   = number of bytes to allocate

The function returns:
Error code, else 0 on success with:
dsix_ptr= Pointer to dsindex assigned to this storage unit. The application
          associates this dsindex with every datarecord that it creates in
          this storage unit.
wd      = Pointer to string giving the allocated wd. It is of the form
          /SUM2/D123456/ where 123456 is a unique number supplied by the DB
          for each SUM_alloc() call (actually can be the dsindex).
          The datasegment records are created under this wd by the application
          with file names of the form record_666.segment_001.fits,
          where 666 represents the unique record number assigned by the
          JSOC Data Record Management System and 001 represents the first of
          possibly multiple datasegments written. (Check with Rasmus.)
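
As an illustration of the file naming described for wd, the sketch below joins a wd with a record number and a segment number. The helper segment_path() is hypothetical and is not part of the SUMS or DRMS API, and the name format itself is only the one quoted above, which the text marks as still to be confirmed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<wd>/record_<recnum>.segment_<NNN>.fits".
 * Returns 0 on success, -1 if the result does not fit in buf. */
int segment_path(char *buf, size_t len, const char *wd,
                 unsigned long long recnum, unsigned segnum)
{
  size_t wlen;
  int n;

  assert(buf != NULL && wd != NULL);
  wlen = strlen(wd);
  /* The wd may or may not end in '/', so add one only when needed. */
  n = snprintf(buf, len, "%s%srecord_%llu.segment_%03u.fits", wd,
               (wlen > 0 && wd[wlen - 1] == '/') ? "" : "/",
               recnum, segnum);
  return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With wd set to /SUM2/D123456/, record number 666 and segment number 1, the helper produces /SUM2/D123456/record_666.segment_001.fits.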

--------------------------------------------------------------------------
int SUM_alloc2(SUM_t *sum, uint64_t sunum, int (*history)(const char *fmt, ...))

        Assigns storage from /SUM for the given sunum (i.e. ds_index),
        does a mkdir, and reports the wd. NOTE: This is designed to replicate
        data from a remote SUMS locally. The sunum must not be from the
        range of sunums assigned to the caller's local SUMS. The sunum
        is first validated as not belonging to this SUMS.
        The dir is owned by the calling user. 
        The application makes a SUM_alloc2() call for each storage unit that
        it wants to replicate data segments to.
        NOTE: Currently you are restricted to make only one alloc at a time,
        i.e. reqcnt must be 1.

The caller sets in sum:
SUMID   = the open id
bytes   = number of bytes to allocate

The function returns:
Error code, else 0 on success with:
dsix_ptr= Pointer to dsindex assigned to this storage unit. In this case, 
          it will be the given sunum.
wd      = Pointer to string giving the allocated wd. It is of the form of
          /SUM2/D123456/ where 123456 is the given sunum.

--------------------------------------------------------------------------
int SUM_put(SUM_t *sum, int (*history)(const char *fmt, ...))

        Puts storage units from allocated storage to the DB catalog.
        Upon success the wd is owned by production.
        NOTE: Currently you are restricted to make only one put at a time,
        i.e. reqcnt must be 1.

The caller sets:
SUMID   = the open id
mode    = [ARCH | TEMP | PERM] + TOUCH for a normal, temporary or permanent
          cataloging with touch option to give tdays below
tdays   = If TOUCH applies, number of days to retain the storage unit
dsname  = dataseries name
group   = the storage group # for this dataseries
reqcnt  = Number of dsindex values given below to put
dsix_ptr= Pointer to array of reqcnt uint64_t to indicate DB index of 
          the dataunits
wd      = Array of char * pointing to the wd of each dsindex given. Value
          is NULL for any missing dataset
        
--------------------------------------------------------------------------
int  SUM_poll(SUM_t *sum)

   Check if a previous request is complete.
 * Return 0 = msg complete, the sum has been updated
 * TIMEOUTMSG = msg still pending, try again later
 * ERRMSG = fatal error

NOTE: Upon msg complete return, sum->status != 0 if error anywhere in the
 path of the request that initially returned the RESULT_PEND status.
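
A bounded polling loop built on these return codes might look like the sketch below. It is an assumption-laden illustration: demo_sum_t stands in for SUM_t, the poll step is passed in as a function pointer in place of SUM_poll(), fake_poll() only imitates a slow request, and the numeric values given to TIMEOUTMSG and ERRMSG are placeholders rather than the real defines.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMED placeholder values; the real ones come from the SUMS headers. */
#define TIMEOUTMSG 2
#define ERRMSG 3

typedef struct { int status; } demo_sum_t;   /* stand-in for SUM_t */
typedef int (*poll_fn)(demo_sum_t *);        /* plays the role of SUM_poll() */

/* Poll at most max_tries times.
 * Returns 0 = request complete without error,
 *         1 = request failed or polling hit a fatal error,
 *         2 = still pending after max_tries polls. */
int poll_until_done(demo_sum_t *sum, poll_fn poll, int max_tries)
{
  int i;

  assert(sum != NULL && poll != NULL);
  for (i = 0; i < max_tries; i++) {
    int rc = poll(sum);
    if (rc == 0)
      return sum->status != 0;   /* msg complete: status carries the result */
    if (rc != TIMEOUTMSG)
      return 1;                  /* ERRMSG or anything unexpected */
    /* A real caller would sleep or do other work here before retrying. */
  }
  return 2;
}

/* Fake poll for demonstration: pending twice, then complete. */
int fake_calls = 0;
int fake_poll(demo_sum_t *sum)
{
  if (++fake_calls < 3)
    return TIMEOUTMSG;
  sum->status = 0;
  return 0;
}
```

Returning a distinct code for "still pending" lets the caller decide whether to keep polling later or to give up, without confusing that case with a failed request.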

--------------------------------------------------------------------------
int  SUM_wait(SUM_t *sum)

   Wait until previous request is complete
 * Return 0 = msg complete, the sum has been updated
 * ERRMSG = fatal error

NOTE: Upon msg complete return, sum->status != 0 if error anywhere in the
 path of the request that initially returned the RESULT_PEND status.

--------------------------------------------------------------------------
int SUM_info(SUM_t *sum, uint64_t sunum, int (*history)(const char *fmt, ...))

        Returns the sum_main table info for the given sunum (i.e. ds_index)

Sample use:
  SUM_info_t *sinfo;

  if(SUM_info(sum, 2650355, printf)) {
    printf("Fail on SUM_info() in main3\n");
  }
  else {
    sinfo = sum->sinfo;
    printf("sum_info online_loc = %s\n", sinfo->online_loc);
    printf("sum_info online_status = %s\n", sinfo->online_status);
    printf("sum_info archive_status = %s\n", sinfo->archive_status);
    printf("sum_info creat_date = %s\n", sinfo->creat_date);
    printf("sum_info arch_tape = %s\n", sinfo->arch_tape);
    printf("sum_info arch_tape_fn = %d\n", sinfo->arch_tape_fn);
    printf("sum_info arch_tape_date = %s\n", sinfo->arch_tape_date);

  }

The function returns:
Error code, else 0 on success with sum->sinfo pointing to:
typedef struct SUM_info_struct
{
  uint64_t sunum;               //aka ds_index
  char online_loc[80];
  char online_status[5];
  char archive_status[5];
  char offsite_ack[5];
  char history_comment[80];
  char owning_series[80];
  int storage_group;
  double bytes;
  char creat_date[32];
  char username[10];
  char arch_tape[20];
  int arch_tape_fn;
  char arch_tape_date[32];
  char safe_tape[20];
  int safe_tape_fn;
  char safe_tape_date[32];
  int pa_status;
  int pa_substatus;
  char effective_date[20];
} SUM_info_t;


--------------------------------------------------------------------------
int SUM_delete_series(char *filename, int (*history)(const char *fmt, ...))

/* Called by the delete_series program before it deletes the series table.
 * Called with a pointer to a filename that has the sunums 
 * that are associated with the series about to be deleted.
 * Returns 1 on error, else 0.
*/
This marks all the given storage units as delete pending, with a
substatus of DADPDELSU so that no Records.txt processing is done for the
storage unit when it is deleted, as the DRMS may have reused the
record numbers in the Records.txt file.

--------------------------------------------------------------------------
int SUM_export(SUMEXP_t *sumexp, int (*history)(const char *fmt, ...))

/* Will take a request (typically from remotesums_ingest)
 * and do an scp for the given host, source and target dirs.
 * The ssh-agent must be set up properly for this scp to complete.
 * Returns 0 on success, else 1.
*/

Example of use is in:
/home/production/cvs/JSOC/base/sums/apps/main2.c

--------------------------------------------------------------------------

EXAMPLE OF USE:

See cvsroot/PROTO/src/SUM/main.c

with corresponding make in Makelinuxia64.mk and Makelinux4.mk

The SUM API library is in:

cvsroot/PROTO/src/libSUMAPI.d (OLD)

and

cvs/jsoc/src/base/sumsapi

--------------------------------------------------------------------------


DISCUSSION:

SUMS runs as a server, sum_svc, which SUM_open() connects to via
a socket. The server can decide how to serialize calls to the DB.



--------------------------------------------------------------------------
20Jan2005

Here's how the SUMS might be used.

When the DRMS first starts it calls SUM_open().
When the DRMS gets a call to write a record it needs a wd to write
the record to. If none is assigned, then it calls SUM_alloc() to get a 
storage unit and it also gets back a dsindex to associate with this data record.

Subsequent records are written to this wd until the storage unit holds the
number of records per unit, which the DRMS knows from the series definition.
When the next record is to be written, another SUM_alloc() must be done.

When the application finally returns to pe, it indicates all the storage
units that were allocated and that need a SUM_put() done on them.
This happens on the return to pe so that an abort and release of everything
can be an option that the module selects in the end.

All the storage units are put one by one with a SUM_put() call with the 
archive mode and retention time and other ancillary info.
This would allow a module to produce separate archivable,
temporary and permanent datasets, compared to mdi which requires that all
output ds have the same archive properties.
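
The allocation bookkeeping described in this scenario can be sketched as below. This is only an illustration of the rollover logic, not DRMS or SUMS code: demo_writer_t, unit_for_next_record() and MAX_UNITS are hypothetical names, and fake_alloc() stands in for a SUM_alloc() call by handing out made-up dsindex values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_UNITS 16   /* arbitrary cap chosen for this sketch */

typedef struct {
  int recs_per_unit;           /* taken from the series definition */
  int recs_in_current;         /* records written to the open unit */
  int nunits;                  /* storage units allocated so far */
  uint64_t units[MAX_UNITS];   /* dsindex of each unit, to be put at the end */
} demo_writer_t;

/* Stand-in for SUM_alloc(): hands out increasing, made-up dsindex values. */
uint64_t fake_alloc(void)
{
  static uint64_t next = 1000;
  return next++;
}

/* Return the dsindex of the unit the next record is written to, allocating
 * a new unit when none is open or the current one is full.
 * Returns 0 if the MAX_UNITS cap would be exceeded. */
uint64_t unit_for_next_record(demo_writer_t *w)
{
  assert(w != NULL && w->recs_per_unit > 0);
  if (w->nunits == 0 || w->recs_in_current == w->recs_per_unit) {
    if (w->nunits == MAX_UNITS)
      return 0;
    w->units[w->nunits++] = fake_alloc();
    w->recs_in_current = 0;
  }
  w->recs_in_current++;
  return w->units[w->nunits - 1];
}
```

When the module finishes, the first nunits entries of units[] are the storage units that each need their own SUM_put(), or that can all be released if the module chooses to abort.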

JsocWiki: SumsApi (last edited 2013-05-01 04:35:24 by localhost)