Welcome to the Wiki for JSOC software development
For general information, including the basic setup to get started, see the Users Guide.
News
JSOC 4.7 released -- Oct 16, 2008 See release notes
JSOC 4.6 released -- Sep 03, 2008 See release notes
JSOC 4.5 released -- Jul 16, 2008 See release notes
JSOC 4.4 released -- Jun 09, 2008 See release notes
JSOC 4.3 released -- May 26, 2008 See release notes
JSOC 4.2 released -- Apr 08, 2008 See release notes
JSOC 4.1 released -- Mar 03, 2008 See release notes
JSOC 4.0 released -- Dec 07, 2007 See release notes
JSOC 3.8 released -- Sep 12, 2007 See release notes
JSOC 3.3 released -- Nov 03, 2006 See release notes
JSOC 3.2 released -- Sep 15, 2006 See release notes
JSOC 3.1 released -- Aug 24, 2006 See release notes
JSOC 3.0 released -- Aug 14, 2006 See release notes
JSOC 2.3 released -- Aug 10, 2006 See release notes
JSOC 2.2 released -- May 30, 2006 See release notes
JSOC 2.1 released -- May 24, 2006 See release notes
JSOC 2.0 released -- Feb 03, 2006 See release notes
JSOC 1.0 released -- Oct 19, 2005 See release notes
DRMS Data Series
DRMS (html) -- DRMS Overview (as pdf). This is an old document; its contents are being verified and included in these pages.
DRMS Names Summary -- DRMS Dataset Names and Queries BNF summary
DRMS Dataset Names -- Full description (pdf)
DRMS Series Names -- Data Series Reserved Names
- Data Storage, Archive, On-Line Retention, etc.
SUMS Data Storage -- SUMS Basic Data Storage Concepts
SUMS Archive and Retention Table -- Meaning and implications of the values
.jsd -- JSOC Series Definition Files
- Data series utilities to be run from a user's shell
create_series -- Create entries and tables for a new series in the DRMS database
describe_series -- Prints a verbose description of the named series and its current highest record number on stdout.
delete_series -- Removes a series and all its associated entries from DRMS.
JSOC Sessions, Pipelines, and Modules (Oh my!)
JSOC programs that use DRMS to operate on DataSeries are called "modules". Modules are run in "sessions". HMI and AIA major processing tasks are accomplished in "pipelines" consisting of one or more sessions. Pipelines are started through the "PUI" (Pipeline User Interface), usually by the JSOC production team. Pipelines may also be initiated by users requesting DataSets via the web or by team members running locally or remotely. A DataSet is a collection of records selected by a query. In essence, a dataset name is simply the query that describes it.
A DRMS Session is the basic unit of computing that interacts with DRMS and SUMS. At the start of a session the user connects to the DRMS database. During the session the user runs one or more modules, which read or create DataRecords in DataSeries. Access to the actual data stored in SUMS is accomplished within a module via the DRMS API. At the end of a session, SUMS is notified to save any new records online and/or on tape, or to delete records marked temporary to the session.
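The session lifecycle described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not the DRMS API: the class ToySession, its methods, and the series names are hypothetical and invented here. In a real session the bookkeeping is done by drms_server and SUMS.

```python
class ToySession:
    """Toy model of a DRMS session (hypothetical; not the real DRMS API)."""

    def __init__(self):
        # "Connecting to the database" is modeled as simply opening the session.
        self.is_open = True
        self._records = []  # (series name, temporary flag) pairs

    def create_record(self, series, temporary=False):
        """Model a module creating a new record during the session."""
        if not self.is_open:
            raise RuntimeError("session is closed")
        self._records.append((series, temporary))

    def close(self):
        """End the session: keep permanent records, drop temporary ones."""
        saved = [series for series, temp in self._records if not temp]
        deleted = [series for series, temp in self._records if temp]
        self.is_open = False
        self._records = []
        return saved, deleted


# Hypothetical series names, used only for illustration.
session = ToySession()
session.create_record("su_example.final_series")
session.create_record("su_example.scratch_series", temporary=True)
saved, deleted = session.close()
```

The point of the model is only that the save-or-delete decision happens once, at the end of the session, rather than each time a module creates a record.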
Actually using the JSOC DRMS requires running a program or module. By "program" we mean a normal shell command, and by "module" we mean a program built to run within a DRMS session and to communicate with a drms_server. There are four types of programs/modules:
- Modules: Most programs that do the work of a JSOC user are what we call "modules". From the outside, modules look like programs, but they must run in a DRMS session. If they are built with the normal jsoc_main program, they use an existing session when run from a Session Provider, or start their own use-once session when called stand-alone from the shell.
- Utility programs: Programs like create_series and describe_series are usually used to manage the existence of dataseries, not to use dataseries. These programs talk directly to the database.
- Session Providers: Programs like drms_run, or later the Pipeline User Interface, start DRMS sessions and execute a script file. They can also be used to execute a single instance of a module.
- drms_server: This program connects to the database and serves sessions. Most users will not need to start drms_server explicitly.
The benefit of running programs as "modules" will hopefully become apparent when we start running complex pipelines using hundreds of processors.
Setting up Your Own DRMS
Information for developers outside the JSOC who wish to construct an independent data archive that can work in cooperation with the JSOC and other archives (or completely independently) can be found on the Setting up Your Own DRMS page.
General Information
DRMS Man Pages
All the JSOC "man" pages are now maintained with doxygen, a semi-automatic documentation tool. They are available in html form via:
All programs and functions should have entries in the doxygen-generated pages, but some do not yet. Old documentation still exists for these:
Note that all modules built with jsoc_main share a basic set of flags and command line keywords. See module.1 in man1.
Unix-style man pages at man1 are planned to return soon, but are not yet available:
Limits
DRMS imposes the following limits:
- memory limits on the number of records in the cache (512 MB / (2.5 * record size)). While this may seem like a lot, for datasets with many keywords (e.g. mdi_vw_V_06h) it can be a real limit on the number of records that can be open at one time. For the vw_V example it means that DRMS_QUERY_MEM should be set to at least 2500 (yes, 2.5 GB) to open 100 days of one-minute data. Modules expecting to need tens of thousands of records open should arrange to do the work in blocks, calling drms_close_records to empty the record cache and free memory.
- length of names of series, keywords, etc. (64 chars)
- length of comma-separated list of prime key names (1024 chars)
- length of descriptions of series, keywords, links, and segments (254 chars)
- length of string values of keywords (not yet known)
- number of keywords in a record (not yet known)
- number of records in a series (no fixed limit)
- length of segment filename (255 chars)
- length of path (511 chars)
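The record-cache limit in the first item above is simple arithmetic, sketched below in Python. The function names are hypothetical, and two points are assumptions rather than facts from these pages: that DRMS_QUERY_MEM is expressed in megabytes of 2^20 bytes, and that the 2.5 factor applies uniformly to every record. The last helper shows one way to split a large job into blocks, in the spirit of the drms_close_records advice; it only computes index ranges and calls no DRMS function.

```python
import math

BYTES_PER_MB = 1024 * 1024  # assumption: "Meg" means 2**20 bytes
OVERHEAD = 2.5              # factor from the limit formula above


def max_cached_records(record_size_bytes, query_mem_mb=512):
    """Records that fit in the cache: query_mem / (2.5 * record size)."""
    return int(query_mem_mb * BYTES_PER_MB // (OVERHEAD * record_size_bytes))


def required_query_mem_mb(n_records, record_size_bytes):
    """Smallest whole-MB setting that holds n_records at once."""
    return math.ceil(n_records * OVERHEAD * record_size_bytes / BYTES_PER_MB)


def blocks(n_records, block_size):
    """Yield (start, stop) index ranges covering n_records in blocks."""
    for start in range(0, n_records, block_size):
        yield start, min(start + block_size, n_records)


# 100 days of one-minute data is 100 * 1440 = 144000 records.
n_records = 100 * 1440
# Hypothetical record size of 7000 bytes, chosen only for illustration.
needed_mb = required_query_mem_mb(n_records, 7000)
per_block = max_cached_records(7000)
work_plan = list(blocks(n_records, per_block))
```

A module would open, process, and close the records of one (start, stop) range before moving to the next, so the cache never holds more than one block.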
Log Files - Processing meta-data
Stdout and stderr are captured in log files as well as shown during processing (depending on the module and the -v flag). These files are all put into a SUMS directory and indexed in DRMS by session ID. The session ID is stored in each record so the log files can be retrieved if and when needed. Unless otherwise specified, the default retention time for log files is the maximum retention time of all SUs (storage units) processed in the current session. The log files are archived if any one of the SUs in the current session is to be archived.
- drms_server logs --
- The default is no logging. When the logging option is turned on (-L), stdout and stderr are redirected to files in the SU directory.
- module logs --
- The default is no logging. When the logging option is turned on (-L), stdout and stderr are tee-ed to files in the SU directory.
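The default retention and archiving rule for log files stated above can be written out as a short Python sketch. The StorageUnit class, its fields, and the log_policy function are hypothetical names used only to illustrate the rule; they do not correspond to actual SUMS structures, and retention is assumed here to be counted in days.

```python
from dataclasses import dataclass


@dataclass
class StorageUnit:
    """Hypothetical stand-in for an SU processed in a session."""
    retention_days: int  # assumption: retention is measured in days
    archive: bool        # True if this SU is to be archived


def log_policy(storage_units):
    """Return (retention_days, archive) for the session's log files.

    Retention is the maximum over all SUs in the session; the logs are
    archived if any one SU is to be archived.
    """
    if not storage_units:
        raise ValueError("no storage units were processed in this session")
    retention = max(su.retention_days for su in storage_units)
    archive = any(su.archive for su in storage_units)
    return retention, archive


# Illustrative values only.
policy = log_policy([StorageUnit(10, False), StorageUnit(60, True)])
```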
Software Development - Building Modules
JSOC Software Tree
Software File Tree -- Organization of files and cvs tree.
GUI for JSOC CVS -- GUI access to DRMS and SUMS software
drms_types.h -- DRMS types and structure definitions.
Making a JSOC/DRMS Module
DRMS Module -- DRMS Module Structure and Overview
DRMS API -- DRMS Data Types and Structures and API
DRMS Module Compilation -- Running 'make' for modules
Making a JSOC/DRMS Library
JSOC Library -- Creating and using a JSOC library
Notes on JSOC Makefile
Development Notes (old)
Database Administration
JSOC Backup and Restore
SUM API
JSOC Development Projects
DSDS-Data Access from JSOC
Exports
Remote DRMS/SUMS - netDRMS
- See Rick Bogart (who might not remember that he has a collection of web pages on NetDRMS at http://jsoc.stanford.edu/netdrms/)
JSOC Operator's Guide
Running Datacapture and lev0 Pipeline during SDI I&T