(file) Return to ReadMe CVS log (file) (dir) Up to [Development] / JSOC / proj / cookbook

File: [Development] / JSOC / proj / cookbook / ReadMe (download)
Revision: 1.11, Thu Nov 3 23:57:19 2011 UTC (11 years, 7 months ago) by rick
Branch: MAIN
CVS Tags: Ver_LATEST, Ver_DRMSLATEST, Ver_9-5, Ver_9-41, Ver_9-4, Ver_9-3, Ver_9-2, Ver_9-1, Ver_9-0, Ver_8-8, Ver_8-7, Ver_8-6, Ver_8-5, Ver_8-4, Ver_8-3, Ver_8-2, Ver_8-12, Ver_8-11, Ver_8-10, Ver_8-1, Ver_8-0, Ver_7-1, Ver_7-0, Ver_6-4, Ver_6-3, Ver_6-2, Ver_6-1, Ver_6-0, NetDRMS_Ver_LATEST, NetDRMS_Ver_9-5, NetDRMS_Ver_9-41, NetDRMS_Ver_9-4, NetDRMS_Ver_9-3, NetDRMS_Ver_9-2, NetDRMS_Ver_9-1, NetDRMS_Ver_9-0, NetDRMS_Ver_8-8, NetDRMS_Ver_8-7, NetDRMS_Ver_8-6, NetDRMS_Ver_8-5, NetDRMS_Ver_8-4, NetDRMS_Ver_8-3, NetDRMS_Ver_8-2, NetDRMS_Ver_8-12, NetDRMS_Ver_8-11, NetDRMS_Ver_8-10, NetDRMS_Ver_8-1, NetDRMS_Ver_8-0, NetDRMS_Ver_7-1, NetDRMS_Ver_7-0, NetDRMS_Ver_6-4, NetDRMS_Ver_6-3, NetDRMS_Ver_6-2, NetDRMS_Ver_6-1, NetDRMS_Ver_6-0, HEAD
Changes since 1.10: +11 -0 lines
added instructions for building debug versions of modules

This directory contains a "cookbook" collection of sample DRMS modules,
along with a simplified make system that can be used to build comparable
applications modules outside of the NetDRMS/JSOC distribution trees.


Table of Contents

	Directories
CVS/		CVS repository information
Makevars/	files containing system/architecture-specific make variable
	definitions for use in a generic Makefile

	Regular Program
smpl_00.c	does (almost) nothing at all - like hello_world, but exhibits
	use of the command-line parsing used by JSOC module drivers

	DRMS Modules
smpl_01.c	same as smpl_00, but written as a module
smpl_02.c	echoes its arguments, of a wide variety of types
smpl_03.c	prints a list of the data series known to DRMS - a simple
	version of show_series
smpl_04.c	prints the number of unique records in a selected data series
	and the number of data segments per record - a simple version of
	show_info
smpl_05.c	inserts data records from noaa_ar.dat into series created with
	noaa_ar.jsd
smpl_06.c	updates data records in series built with smpl_05 with new
	keyword values
smpl_07.c	inserts image data in record segments in a data series created
	with images.jsd
smpl_08.c	calculates and reports primary statistics of record segments
	such as those generated with smpl_07

	Other
Makefile	The generic Makefile
ReadMe		(this file)
images.jsd	A JSD file that can be used to build a (keywordless) DRMS data
	series for storing data segments
noaa_ar.dat	A summary listing of data from daily NOAA Active Region reports
	for regions with spots from 1996 onward; can be used as input for a
	DRMS data series
noaa_ar.jsd	A JSD file that can be used to build a (segmentless) DRMS data
	series for the data from noaa_ar.dat

********************************************************************************

Recipe 0: How to write and build a program in this directory, using features
of the DRMS library

The program smpl_00.c is technically not a DRMS module, just a standard C
program similar to the familiar "Hello, world" one. The only real difference
is that it is linked against the DRMS API to use its command-line argument
parsing features. The message that the program prints when executed can be
changed from the default, "done", by running the command:

	smpl_00 print=something
or
	smpl_00 print= "Hello, world!"

(White space after the '=' sign is optional; quotes are necessary if the
argument string has embedded white space.)

The program can be built, like all of the modules in this directory as well,
by typing

	make smpl_00

(alternatively, you can simply type make and all the modules in the target
list MODS in the Makefile will be compiled.) The Makefile is generic: it is
designed to work on different hardware and operating system platforms by
accessing an appropriate set of make variables definitions from the various
files in the subdirectory makevars. To add a new module that does not require
linking with any additional libraries beyond those required for DRMS, all you
need to do (besides writing the module!) is to append its name (without the
filename extension) to the list of MODS in the Makefile.

Note that the Makefile requires the external definition of the environment
variable JPLAT to the appropriate name for your build platform. If, when you
run the make, you get a message like:

	Error: no appropriate Makevars; is $JPLAT defined? won't compile...

Either you have not defined the system-appropriate environment variable,
or there is no file Makevars_$JPLAT.mk in the Makevars directory. It should
be straightforward to create an appropriate one by replicating one of the
existing examples and substituting appropriate command paths.  Likewise,
if you get a message like:

	Makefile:1: Makevars/Makevars_XXX.mk: No such file or directory
	make: *** No rule to make target `Makevars/Makevars_XXX.mk'.  Stop.

it means that you have defined the environment variable JPLAT (as XXX), but
there is no corresponding Makevars_XXX.mk in the directory. Again, you can
create the appropriate one by replicating one of the existing examples and
substituting appropriate command paths.

(Alternatively, if you will only be building and running on one platform
ever, you could just add a platform definition line like:

	JPLAT = linux_x86_64

or whatever before the first line in the Makefile.)

The architecture-dependent Makevars_*.mk files define absolute paths to
essential commands, especially compilers. If you get messages like:

	make: /usr/local/bin/icc: Command not found

You must suitably redefine the relevant Make variables (e.g. CC or ICC in
this example) in the file being included. If you have to change from icc
to gcc, be sure to change the appropriate CFLAGS and LDFLAGS variables as
well, removing the -xW option.

The Makefile also requires external definition of the environment variable
DRMS to be the root of your NetDRMS distribution as described in the
installation instructions. If you get an error message like:

error: could not open source file "cmdparams.h"

it is probably because you have not correctly set the DRMS environment variable.
(Again, if you wish you could define DRMS in the Makefile rather than as an
environment variable.)

The Makefile assumes that all third-party libraries, in particular those for
cfitsio and postgres that are essential for DRMS, reside in one of the default
search locations, /usr/lib or /usr/local/lib. If they are elsewehere, then
you will need to add the base path to the LIBD definition line in the Makefile.

In order to build a debug version of any of the modules in the project (with
symbols and without optimization), you can set the environment variable JPLAT
to debug. The debug "platform" variables are identical to those in the
linux_x86_64 file, except for the C compiler options. Since it is very likely
in debugging that you may need symbols in the DRMS library as well, you should
build a debug platform version of that library by running (in the DRMS root
directory):

	setenv JSOC_DEBUG 1
	make MACH=debug

********************************************************************************

Recipe 1: How to write and build a minimal DRMS module

The program smpl_01.c is a true DRMS module; it exhibits the minimum structure
necessary for a C program that will be able to interact with the DRMS API to
fetch and store date in the DRMS and SUMS. This module in fact does neither,
as it is identical in functionality to smpl_00.c; but it nevertheless opens
the required connection to the DRMS, and will fail if that is not possible for
some reason, such as the database server being unavailable or the user lacking
proper authorization to access it. It can therefore serve as a useful test for
minimal connectivity to DRMS.

Note that this program does not have a main() entry, it is instead called
DoIt(). All DRMS modules are called DoIt() and link against another main()
program in the DRMS library that opens a connection to DRMS and SUMS for
them, calls DoIt(), and then closes the connection, saving or discarding
the results according to the return status of the module.

In addition to the "print" argument, this module also features two "flag"
arguments. If you run
	smpl_01 -v
it will print out a little more information about what it is up to (this is
a common flag), and if you run
	smpl_01 -a
it will force an abort, that is, returns a non-zero value to the main program.
You can also run
	smpl_01 -av
or	smpl_01 -a -v

You can get a list of the arguments, and default values (if any), by typing
	smpl_01 -H
This is a feature of all DRMS modules, as it is handled by the (hidden) main
program.  Likewise, the command
	smpl_01 -V
results in some extra verbose information from the main program itself before
and after the module is called.

********************************************************************************

Recipe 2: Command-line Argument Parsing

Module smpl_02 also has no interaction with the DRMS database. It shows the
full range of command-line argument types and how they are to be processed
in the module. Type
	smpl_02 -H
to see the full argument list and then fool around with various values for the
arguments. Most of the argument types should be self explanatory. ARG_INTS and
ARG_FLOATS can be used to specify arrays of arbitrary length (including zero)
of integers or floats, comma separated and enclosed (if there are more than one)
within matching pairs of either brackets, braces, or parentheses. ARG_NUME is
used for an enumerated list of possible strings, returning the order of the
selected string in the enumerated list, similar to an enum declarator in C.

********************************************************************************

Recipe 3: Communicating with the DRMS Database - show_series

Module smpl_03 shows how direct communication with the database can be done
through a module. It is basically a simple version of the show_series
application.  The module still bypasses most of the API. Normally you should
not need to communicate with the database at such a low level in a module, but
this shows how to do it if you are familiar with SQL and need more power and
flexibility than is provided by the record management API. You can use this
module, like show_series, to see what data series are already in your DRMS
database.

********************************************************************************

Recipe 4: Beginning to Use the API - show_info

Module smpl_04 is the first one to actually use the DRMS record management
API. It is a (very) simplified version of the show_info application. You can
use this module to begin to explore the various data series in your DRMS.

********************************************************************************

Recipe 5: Adding Records to a Data Series

Up to this point, all of the cookbook modules operated on data series that
already existed. At some point, you need to be able to create and populate
data series yourself. Module smpl_05 is an example of a module that can be
used to add (or update) records in a data series, in this case one containing
records correspnding to the daily NOAA reports of individual solar active
regions with spots. The data from those reports has been assembled in a
simple ASCII table text form in the file noaa_ar.dat. The module smpl_05
reads data from that table and inserts the appropriate records in a data
data series. In order to exercise it, you will first need to create a data
series with the appropriate structure that you can write to. To do so, run
	create_series noaa_ar.jsd
You will probably have to change the Seriesname from drms.NOAA_ar to
something in a different namespace from drms; each user should have a
personal namespace. Then you can run the module with the appropriate value
for the "ds" argument.

Any time you want to remove this series (to try out different series
modifications in the JSD for example), you can delete it by typing
	delete_series drms.NOAA_ar
(or whatever its series name is) and answering yes to the safeguard prompts.
If you do not change the series name in the JSD, you can also force its
deletion and recreation by typing
	create_series -f noaa_ar.jsd
If you give it a new series name of course it will not remove the old one.

Once you have created the data series, you can run
	smpl_05
The default value for the ds parameter is drms.noaa_ar, so you will have to
run the module with a different series name for the "ds" argument the series
you created has a diffeent name. The default value for the "data" argument
is noaa_ar.dat, a file included in the cookbook.

********************************************************************************

Recipe 6: Modifying Records in a Data Series

If you inspect the last few records in the data series just created, for
example with
	show_info -a ds= drms.noaa_ar n= -5
you will discover that the penultimate record, that for the observation of
AR 11024 on 2009.07.10, has a very strange and probably erroneous value
for its latitude, -9 deg, when the latitude on the preceding and following
days was -25 deg. This is the result of what was presumably a typographical
error in the NOAA/USAF report for that day (SRS Number 191 Issued at 0030Z
on 10 Jul 2009). Module smpl_06 is an example of one that could be used to
correct or update the values of selected records in a data series. It shows
how runtime parameters (and internally computed variables) can be used in
place of record set specifiers to restrict the recordset, and how key values
in selected records can be "changed".  In order to fix the relevant record,
you could run either
	smpl_06 date= 2009.07.10 ar= 11024 key= lat value= -25
or
	smpl_06 ds= "drms.noaa_ar[2009.07.10][11024]" key= lat value= -25
and then reinspect the last few records. The record in question should have
been updated with a new value for its latitude.  In reality, the DRMS API
does not allow for actual value changes to records in the database, which is
why the function drms_clone_record() is used in this update module. If you
type
	show_info -ar ds= "drms.noaa_ar[\!Region=11024\!]"
you will see that there are really two entries in the database with the same
values for the two prime keys Date=2009.07.10 and Region=11024. The one with
the higher recnum is the only one that will show up in ordinary queries.

********************************************************************************

Recipe 7: Creating Records with Data Segments

The data series worked with in the previous two examples had no associated
data segments, repositories for bulk data such as images or other data arrays.
The JSD file images.jsd describes a series with the opposite structure. It
contains no keywords at all in the database (except for the hidden keywords
like recnum, so that all records are unique). However, it does allow for a
segment, which can be any 2-dimensional array of short ints (which can
represent scaled floats), stored internally as Rice-compressed FITS binary
tables.  As in the preparations for Recipe 5, run
	create_series images.jsd

You can then run smpl_07 to create records (one at a time) in this series.
Each will have a data segment of specified size and shape, with the values
filled randomly using a somewhat Byzantine procedure to create a semblance
of large-scale structure. You might want to fool around with different values
of the parameters dist and seed.  In order to view the data segments, run for
example
	show_info -pq ds= drms.images"[]"
and then run a viewer, e.g. ds9, on the fits files in the displayed SUMS
directories.

Note a few interesting features of the JSD file you used to create this data
series. First the Archive flag is a negative number. Normally the Archive flag
is set to 1 if the data segments are to be archived to tape before they age
off the SUMS disks (in 10 days in this case), 0 otherwise. However, if the
data segments are allowed to disappear without having been archived to tape,
the data records in the DRMS will still remain. Sometimes that is useful, but
in this case, with a series that contains no ancillary data at all, that would
be pointless. The value of -1 will force the DRMS to actually remove any records
when their corresponding data segments age off the disk.

Also, note that the module automatically created FITS files named v.fits
with binary table row compression. That is because of the parameters specified
for the series Segment "Data" in the JSD file. You might want to experiment
with different values for the various fields in the Segment descriptor, "name",
scope, "type", "naxis" (which would cause the module to fail if changed),
"axis_n" (which must be specified if scope is variable rather than vardim and
might cause the module to fail if inappropriate values of parameters are
chosen, "protocol", the compression and scaling parameters. In order to
conduct such experiments you must either create a new series with a different
series name by specifying a different value for Seriesname in the JSD, or, if
you wish to keep the series name, recreate the series afresh, since its
essential structure is being changed. To do the latter, run:
	delete_series drms.images
(or whatever name you gave the series) and then
	create_series images.jsd
Alternatively, you can just run:
	create_series -f images.jsd
which will force deletion of the existing series if it has the same Seriesname.

********************************************************************************

Recipe 8: Analyzing Data in Record Data Segments

Now that you have created a data series with records containing "real" data,
you can begin to explore the functions that fetch data from SUMS segments into
memory for processing.

smpl_08.c is a simple program that calculates and prints the primary statistics
(count, mean, standard deviation) of each segment (or a selected segment) of
a selected data set. It takes as arguments the input data set specification and
optionally a segment name as well. The default record set specified is the
last record inserted into the data series by smpl_07. (If you gave that series
a different name from drms.images you will either have to change the default
value for the "ds" argument in smpl_08.c or provide the correct series name
as the value of the argument ds when you run smpl_08.

Karen Tian
Powered by
ViewCVS 0.9.4