= DRMS Module =
== jsoc_main ==
A DRMS module is linked with jsoc_main, which provides the C language "main()" function. If the module is called from within a DRMS session (i.e. there is a DRMSSESSION parameter on the command line or, more usually, in the environment), main establishes connections to the session server, deals with saving log files, calls the module code via an external function called "DoIt", and closes the connection to the session server.
jsoc_main:
* Expects global variables:
* extern ModuleArgs_t module_args[]; /* Default module parameters */
* extern char *module_name; /* Module identifier string. */
* extern int DoIt(void); /* Module entry point. */
* Defines global variables:
* CmdParams_t cmdparams; /* returns parsed command line parameters */
* DRMS_Env_t *drms_env=NULL; /* DRMS Environment handle */
* Uses certain reserved command line parameters:
* -V (Verbose flag)
* -Q (Quiet flag)
* -L (Log flag)
* -H (Help flag)
* DRMS_QUERY_MEM
* DRMSSESSION
-V:: prints all the command line parameters in cmdparams before execution
-Q:: turns off output to regular stdout and stderr; output will still be logged to the
DRMS session log directory
-L:: turns on saving stdout and stderr in DRMS session log directory
-H:: prints a usage message and exits (also, --help)
DRMS_QUERY_MEM:: maximum memory allocation for record queries in MB. (default 512)
DRMSSESSION :: an existing DRMS session ID, formed by combining the drms_server host
name with a port number. This parameter is required if and only if the
module is to connect to an existing DRMS server; if it is not provided,
the jsoc_main() program will start a private server for the module
The following optional parameters are applicable if and only if a private server is to be
started; see below for descriptions:
DRMS_LOG_RETENTION::
DRMS_SERVER_WAIT::
JSOC_DBHOST::
JSOC_DBUSER::
JSOC_DBPASSWD::
JSOC_DBDBNAME::
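
As a rough illustration of the DRMSSESSION value described above, the sketch below splits a `host:port` string into its two parts. The colon-separated form is taken from the `akhenaten:33137` example later on this page; `parse_session` is a hypothetical helper written for this illustration, not a DRMS function, and the real jsoc_main code may parse the value differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of DRMS: split "host:port" into its parts.
 * Returns 0 on success, -1 if the string is malformed or the host name
 * does not fit into the caller's buffer. */
int parse_session(const char *session, char *host, size_t hostlen, int *port)
{
    const char *colon;
    char *end;
    long value;
    size_t n;

    assert(host != NULL && hostlen > 0 && port != NULL);
    if (session == NULL)
        return -1;

    colon = strrchr(session, ':');      /* the last ':' separates the port */
    if (colon == NULL || colon == session)
        return -1;                      /* no separator, or empty host name */

    n = (size_t)(colon - session);
    if (n >= hostlen)
        return -1;                      /* host name would be truncated */

    value = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || value < 1 || value > 65535)
        return -1;                      /* port missing, garbled, or out of range */

    memcpy(host, session, n);
    host[n] = '\0';
    *port = (int)value;
    return 0;
}
```

A caller would typically pass in the value of the DRMSSESSION parameter and treat a return value of -1 as a reason to refuse to connect.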
If jsoc_main finds a DRMSSESSION variable set, it gets the DRMS session handle and prepares to start the module as part of that session. It first calls drms_open to connect to the session server by setting up connections to the session DB server and the SUMS server.
Then, if the "-L" flag is set, it configures ''stdout'' and ''stderr'' to go to files in the session log directory. If the "-Q" quiet flag is also set, they are simply redirected to the log files; if the quiet flag is not set, they are "teed" so that the output appears both in the usual place and in the log files. jsoc_main then calls DoIt() to execute the module. If the module returns 0, all data records created by the module are inserted into the database and become permanent at the next session commit. If the return status is not zero, all data inserted into the database since the last session commit are rolled back and the DRMS session is aborted.
If jsoc_main does not find the DRMSSESSION variable it starts up a private session and
executes the module with communication to its own drms server process.
In either case it reports the module return value back to the calling program, drms_run or other session manager.
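
The commit-or-abort rule described above can be sketched as follows. This is only an illustration of the control flow under simplified assumptions: `run_module`, `SessionAction`, and the two stand-in modules are hypothetical names, and the real jsoc_main also handles session setup, logging, and communication with the server.

```c
#include <assert.h>
#include <stddef.h>

/* All names in this sketch are illustrative; they are not the DRMS API. */
typedef enum { SESSION_COMMIT, SESSION_ABORT } SessionAction;
typedef int (*ModuleEntry)(void);

/* Run the module entry point and decide what happens to its records:
 * status 0 keeps them for the next session commit, any other status
 * rolls them back and aborts the session. The module's own status is
 * handed back unchanged so it can be reported to the caller. */
SessionAction run_module(ModuleEntry doit, int *status_out)
{
    int status;

    assert(doit != NULL);
    status = doit();
    if (status_out != NULL)
        *status_out = status;
    return status == 0 ? SESSION_COMMIT : SESSION_ABORT;
}

/* Two stand-in modules for trying the wrapper out. */
int module_ok(void)   { return 0; }
int module_fail(void) { return 3; }
```

With these stand-ins, `run_module(module_ok, &st)` yields `SESSION_COMMIT` with `st` set to 0, while `run_module(module_fail, &st)` yields `SESSION_ABORT` with `st` set to 3.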
The top-level file for implementing a module should look something like this:
{{{
------------------ module.c ------------------------
#include "drms.h"
#include "jsoc_main.h"
/* List of parameters */
ModuleArgs_t module_args[] = {
{ARG_type, "parm_name1", "default value1", "description (optional)",
"valid range (optional)"},
{ARG_type, "parm_name2", "default value2"},
{ARG_type, "mandatory_parm3"},
/* ... more named parameter values ... */
{} /* List must end with null pointer; ARG_type ARG_END also works */
};
/* Module name presented to DRMS */
char *module_name = "this_module_name";
/* Module main function */
int DoIt(void) {
int status = 0;
/* Do work...
...
Set status = 0 to indicate success.
Set status to a nonzero value to indicate failure.
*/
return status;
}
----------------------------------------------------
}}}
== DRMS command line parsing ==
The functions associated with command line parsing are found in
jsoc/src/util/util/cmdparams.{c,h}.
The DRMS main program parses the module command line and stores
the information in a global data structure
CmdParams_t cmdparams;
that can be used to access the parameters from anywhere within the
module code, including library subroutines. The command line consists
of four types of tokens:
* named parameters given in one of the forms "variable= value",
"variable=value" or "--variable value"
* single letter flags "-a -b -c" which can also be written in
concatenated form "-abc". Flags are translated into single-letter
named parameters with the value "1".
* unnamed argument strings of the form "value"
* command line files of the form "@filename". Each line in such a file
is parsed as an additional command line. Command files may contain
references to other command files. Blank lines or lines beginning
in "#" are treated as comment lines and ignored.
Command line files are a convenient mechanism to circumvent the
limitation on the number or length of command line arguments in
some shells.
{{{
-------- example -----------
Example: Assume that the file inputs.conf contains the three lines
# This is a test
input1.txt
input2.txt
then the command line
module.exe -vf test=debug abc.txt --log logfile def.bin @inputs.conf
will be parsed to have 4 named parameters
v = "1"
f = "1"
test = "debug"
log = "logfile"
and 4 unnamed arguments
abc.txt
def.bin
input1.txt
input2.txt
-------- end example ------
}}}
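
The four token types can be told apart by their first characters. The sketch below is a hypothetical classifier written for illustration, not the code in cmdparams.c. It assumes that a flag group is a "-" followed by a letter, and that the value belonging to a "--variable" or "variable=" token is picked up from the following token by the caller.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef enum {
    TOK_NAMED,      /* variable=value, or variable= with the value following */
    TOK_LONG_NAME,  /* --variable, with the value in the next token */
    TOK_FLAGS,      /* -a, or a concatenated group such as -abc */
    TOK_CMDFILE,    /* @filename */
    TOK_UNNAMED     /* anything else */
} TokenKind;

/* Classify a single command line token by its leading characters. */
TokenKind classify_token(const char *tok)
{
    assert(tok != NULL);

    if (tok[0] == '@' && tok[1] != '\0')
        return TOK_CMDFILE;
    if (tok[0] == '-' && tok[1] == '-' && tok[2] != '\0')
        return TOK_LONG_NAME;
    if (tok[0] == '-' && isalpha((unsigned char)tok[1]))
        return TOK_FLAGS;
    if (tok[0] != '=' && strchr(tok, '=') != NULL)
        return TOK_NAMED;
    return TOK_UNNAMED;
}
```

Applied to the example above, "-vf" is a flag group, "test=debug" a named parameter, "--log" a long name whose value "logfile" follows, "@inputs.conf" a command file, and "abc.txt" an unnamed argument.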
The values of the named parameters are read using the following
functions:
{{{
char *cmdparams_get_str(CmdParams_t *parms, char *name, int *status);
int8_t cmdparams_get_int8(CmdParams_t *parms, char *name, int *status);
int16_t cmdparams_get_int16(CmdParams_t *parms, char *name, int *status);
int32_t cmdparams_get_int32(CmdParams_t *parms, char *name, int *status);
int64_t cmdparams_get_int64(CmdParams_t *parms, char *name, int *status);
float cmdparams_get_float(CmdParams_t *parms, char *name, int *status);
double cmdparams_get_double(CmdParams_t *parms, char *name, int *status);
double cmdparams_get_time(CmdParams_t *parms, char *name, int *status);
int cmdparams_get_int(CmdParams_t *parms, char *name, int *status);
}}}
If the named parameter was not given on the command line
the functions above try to obtain their values from the environment
using the getenv function. Therefore the commands
module.exe blah="Hello"
and
setenv blah Hello<<BR>>
module.exe
should have the same outcome.
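
The lookup order (command line first, then the environment) can be sketched as below. `NamedParam` and `lookup_param` are hypothetical names for this illustration; the real CmdParams_t structure is more elaborate and is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for one parsed "name=value" pair. */
typedef struct {
    const char *name;
    const char *value;
} NamedParam;

/* Return the value of `name` from the parsed command line if it is there,
 * otherwise whatever getenv() reports, which may be NULL. */
const char *lookup_param(const NamedParam *params, size_t count,
                         const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(params[i].name, name) == 0)
            return params[i].value;     /* the command line wins */
    }
    return getenv(name);                /* fall back to the environment */
}
```

In this sketch a value given on the command line takes precedence over an environment variable of the same name, which matches the order described above.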
The function
int cmdparams_exists(CmdParams_t *parms, char *name);
returns 1 if a named parameter matching the string in "name"
was given on the command line, and 0 if no such parameter was
given.
The (string) values of the unnamed arguments are read using the
following functions:
char *cmdparams_getarg(CmdParams_t *parms, int num);<<BR>>
int cmdparams_numargs(CmdParams_t *parms);
The call
cmdparams_getarg(&cmdparams, 0);
returns the name of the running program (argv[0]).
In the cases where the return value is a string (char *) the string must
not be changed or "free-ed". It should be copied to working memory if
it needs to be modified. If "status" is NULL on call it will be ignored and
you will not be advised if the parameter lookup failed. Failure can occur if
an optional parameter is not found on the command line or environment or
if a number conversion fails.
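
The optional-status convention can be sketched with a small conversion routine. This is not the cmdparams implementation: `sketch_get_int32` and the `SKETCH_*` codes are hypothetical, the real status values are defined in cmdparams.h, and returning 0 on failure is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch only. */
enum { SKETCH_OK = 0, SKETCH_MISSING = -1, SKETCH_BAD_NUMBER = -2 };

/* Convert `text` to a 32-bit integer. `text` is NULL when the parameter
 * was found neither on the command line nor in the environment. */
int32_t sketch_get_int32(const char *text, int *status)
{
    int code = SKETCH_OK;
    long value = 0;
    char *end;

    if (text == NULL) {
        code = SKETCH_MISSING;
    } else {
        errno = 0;
        value = strtol(text, &end, 10);
        if (end == text || *end != '\0' || errno == ERANGE ||
            value < INT32_MIN || value > INT32_MAX) {
            code = SKETCH_BAD_NUMBER;   /* not a number, or out of range */
            value = 0;
        }
    }
    if (status != NULL)                 /* a NULL status pointer is ignored */
        *status = code;
    return (int32_t)value;
}

/* Usage: the caller passes a status pointer only if it cares about failure. */
void sketch_get_int32_demo(void)
{
    int status;

    assert(sketch_get_int32("42", &status) == 42 && status == SKETCH_OK);
    assert(sketch_get_int32("42", NULL) == 42);
}
```

Passing NULL for `status` is convenient for parameters with safe defaults, but it hides conversion errors, so checking the status is the safer habit for anything that affects results.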
Required parameters and default values are declared in the global array
module_args, which must be present in the module. The array
takes the following form:
{{{
ModuleArgs_t
module_args[] = {
{parm1type, "parm_name1", "default value1", "description (optional)",
"valid range (optional)"},
{parm2type, "parm_name2", "default value2"},
{parm3type, "mandatory_parm3"},
/* ... more named parameter values ... */
{} /* List must end in null or ARG_END type */
};
}}}
If the value field in the struct for a given parameter is either
NULL or an empty string, it means that the parameter is mandatory and must be
present on the command line (or in an included parameters file, or in the
environment...). If not, an error message will be printed out and the module
terminated immediately after command line parsing.
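
The mandatory-parameter rule can be sketched as a simple check over the argument list. `ArgSpec` is a reduced stand-in for ModuleArgs_t with assumed field names, and `count_missing` and the two lookup stand-ins are hypothetical; the real check is performed inside the DRMS main program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Reduced stand-in for one ModuleArgs_t entry; field names are assumed. */
typedef struct {
    const char *name;
    const char *default_value;  /* NULL or "" marks the parameter mandatory */
} ArgSpec;

int is_mandatory(const ArgSpec *spec)
{
    assert(spec != NULL);
    return spec->default_value == NULL || spec->default_value[0] == '\0';
}

/* Count mandatory parameters for which have_value(name) reports no value,
 * printing one message per missing parameter. The list ends at a NULL name. */
int count_missing(const ArgSpec *specs, int (*have_value)(const char *name))
{
    int missing = 0;

    assert(specs != NULL && have_value != NULL);
    for (; specs->name != NULL; specs++) {
        if (is_mandatory(specs) && !have_value(specs->name)) {
            fprintf(stderr, "missing mandatory parameter \"%s\"\n", specs->name);
            missing++;
        }
    }
    return missing;
}

/* Stand-ins for "was this given on the command line, in a command file,
 * or in the environment?". */
int nothing_given(const char *name)    { (void)name; return 0; }
int only_parm3_given(const char *name) { return strcmp(name, "mandatory_parm3") == 0; }
```

A nonzero result from `count_missing` corresponds to the case in which the module is terminated right after command line parsing.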
The following argument types are recognized:
== DRMS data functions ==
The module reads and writes data using the functions described in
jsoc/CM/*/drms_api.txt.
== Running a DRMS module ==
Running one or more DRMS modules involves three main steps:
1. starting a DRMS session,
2. running the module(s) and
3. closing the session.
The final step will either commit all the data generated by
modules in the session or discard it if an error occurred.
The script /jsoc/scripts/drms/drms_run automates the three steps
detailed below, and allows modules (or scripts containing multiple
module commands) to be run with a single command.
The command
{{{
host:~> drms_run ''command'' [options...]
}}}
will start a new DRMS server, run ''command'' and depending on the exit
status of ''command'' will either commit or discard changes to the
database and stop the DRMS server. drms_run will use the drms_server
executable pointed to by the environment variable DRMS_SERVER_EXE. If
DRMS_SERVER_EXE is not set drms_run will assume that an executable
"drms_server" is in your path. The output from the DRMS server is
piped to the file pointed to by the environment variable
DRMS_LOGFILE. If DRMS_LOGFILE is not set, drms_run will create a log
file in /tmp/DRMS.''pid'', where ''pid'' is the PID of the drms_run
script interpreter.
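
The two fallbacks can be sketched as below. drms_run itself is a script, so this C version is only an illustration: `choose_server_exe` and `choose_logfile` are hypothetical helpers, and the sketch assumes the /tmp/DRMS.''pid'' default applies when DRMS_LOGFILE is unset.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pick the server executable: the DRMS_SERVER_EXE value if it is set,
 * otherwise plain "drms_server", to be found via the search path. */
const char *choose_server_exe(const char *env_value)
{
    return env_value != NULL ? env_value : "drms_server";
}

/* Pick the log file: the DRMS_LOGFILE value if it is set, otherwise
 * /tmp/DRMS.<pid>. Returns buf, or NULL if buf is too small. */
const char *choose_logfile(const char *env_value, long pid,
                           char *buf, size_t buflen)
{
    int n;

    assert(buf != NULL && buflen > 0);
    if (env_value != NULL) {
        if (strlen(env_value) >= buflen)
            return NULL;                /* would not fit */
        strcpy(buf, env_value);
        return buf;
    }
    n = snprintf(buf, buflen, "/tmp/DRMS.%ld", pid);
    return (n > 0 && (size_t)n < buflen) ? buf : NULL;
}
```

A caller would pass `getenv("DRMS_SERVER_EXE")` and `getenv("DRMS_LOGFILE")` as the `env_value` arguments, together with the process ID of the running script or program.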
The three steps are carried out as follows:
a. Before you run modules you must have a DRMS server running to
act as a session master. This can be done by running the command drms_server (see DrmsServerCmd).
{{{
host:~> jsoc/bin/<target>/drms_server -f
}}}
The server will print out what interface it is listening
for connections on. For example:
{{{
akhenaten:~/jsoc> bin/custom.akhenaten/drms_server -f
DRMS_HOST = akhenaten.Stanford.EDU
DRMS_PORT = 33137
DRMS_PID = 20955
DRMS_SESSIONID = 38
DRMS server started with pid=20955, noshare=0, noroe=0
...
}}}
The "-f" flag makes the server run in the foreground. Without "-f" the drms_server command spawns a server in a background process, prints the connection info to stdout (as above) and exits.
The server will print log messages to stdout and stderr (TBD: Clean up error handling and logging.), and these should be piped to a file if you intend to keep them.
b. Now you can run the module(s). The modules do not need to run on the same host as the server. They can run on any host as long as they are able to open a TCP socket connection to the server process.
When running a module, the named parameter DRMSSESSION must be set to indicate the host and port where the DRMS server is listening for connection attempts. It is perhaps most convenient to do this by setting the environment variable DRMSSESSION. In the example above this would mean executing the command:
{{{
akhenaten:~/jsoc> setenv DRMSSESSION akhenaten:33137
}}}
Each module that connects causes the server to spawn a new thread to service the new client. The server can service multiple clients simultaneously, but database operations are serialized within the server and executed sequentially using a shared connection to the DRMS database.
c. When all modules have finished successfully you can either<<BR>>
a. tell the DRMS server to stop and commit all data generated or modified by the modules to the DRMS database by sending a SIGUSR1 signal to it. In the example above that would mean issuing the command
{{{
akhenaten:~/jsoc> kill -s USR1 20955
}}}
or if an error occurs you can<<BR>>
b. tell the DRMS server to abort and discard all data generated by the modules by sending it a SIGTERM, SIGQUIT or SIGINT. In the example above that could be done by pressing CTRL-C in the terminal where the server is running or by issuing the command
{{{
akhenaten:~/jsoc> kill -s INT 20955
}}}
It should be safe to kill the server with SIGKILL (kill -9). It will have the same effect as a regular abort except that it leaves a stale entry in DRMS's active session table.
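
The signal conventions above amount to a small mapping from signal number to server action. The sketch below is hypothetical and is not the drms_server source: `ServerAction`, `action_for_signal`, and `record_signal` are illustrative names, and SIGKILL does not appear because a process cannot catch it.

```c
#include <assert.h>
#include <signal.h>

typedef enum { SERVER_COMMIT, SERVER_ABORT, SERVER_IGNORE } ServerAction;

/* Map a received signal to the action described in the text. */
ServerAction action_for_signal(int signo)
{
    switch (signo) {
    case SIGUSR1:
        return SERVER_COMMIT;   /* stop and commit the session's data */
    case SIGTERM:
    case SIGQUIT:
    case SIGINT:
        return SERVER_ABORT;    /* abort and discard the session's data */
    default:
        return SERVER_IGNORE;   /* not covered by the description above */
    }
}

/* The handler only records the signal; the decision is made later,
 * outside signal context. */
volatile sig_atomic_t pending_signal = 0;

void record_signal(int signo)
{
    pending_signal = signo;
}

/* Self-check: deliver SIGUSR1 to this process and confirm it maps to commit. */
void signal_sketch_selfcheck(void)
{
    pending_signal = 0;
    signal(SIGUSR1, record_signal);
    raise(SIGUSR1);
    assert(action_for_signal(pending_signal) == SERVER_COMMIT);
    signal(SIGUSR1, SIG_DFL);
}
```

Recording the signal in a `sig_atomic_t` and acting on it from the main loop keeps the handler trivial, which is the usual way to avoid calling non-reentrant code from signal context.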