AJAX Interface for the JSOC Export System
The Export System allows users to request subsets of data from the JSOC, exporting images and data in a variety of file formats.
Plan:
- Implement basic access to keywords and segment information via CGI-BIN GET and POST server programs to allow Javascript access to basic JSOC functions.
- Provide basic browser side html to demonstrate capability and serve as a starting point for simple use and development.
- Publish the spec for the interface to allow others to build better Client-side tools.
Overview of Export Management
Exports will be managed in the series jsoc.export. Each request will be given a RequestID which can be used to track status. Data to be exported can be requested via several methods. The data requested will be collected (e.g. tar) and placed in a data segment in a record in jsoc.export. It will be exported (i.e. picked up by the requestor) directly from the SUMS directory.
JSOC exports will not be archived and will have a retention time of 7 days. Thus the Requestor will have one week to fetch the data. The export record will be retained as a history log of exports.
The jsoc.export series used to manage the export process looks like:
Prime Keys are: RequestID
DB Index Keys are: RequestID
All Keywords for series jsoc.exports:
  DataSet (string) Dataset requested
  ExpTime (time) Time of export
  FilenameFmt (string) File basename format
  Processing (string) Data processing request, beyond Protocol choices.
  Protocol (int) Data export protocol, options include as-is, fits, fits compressed.
  Method (string) Data transfer method, e.g. url, ftp, ftp-tar, ...
  Notify (string) Notification address
  ReqTime (time) Time of request
  RequestID (int) Export request identifier
  Requestor (string) Name of requestor
  Size (int) Volume of data requested
  Status (int) Status of request
Segments for series jsoc.exports:
  Data NA generic VAR Exported data
Status of prototype implementation of the draft plan
Many of the functions described below are now implemented in series_info and in a new module called jsoc_info which has heritage in show_info but emits JSON responses. Several open-source tools have been used in developing jsoc_info as well as http://jsoc.stanford.edu/ajax/lookdata.html which is a test client-side javascript code to use jsoc_info. These include:
Prototype Javascript Framework at http://www.prototypejs.org/ which provides AJAX style javascript functions enabling much easier javascript coding.
The JSON home which describes the XML-like JSON data protocols with links to software.
M'sJSON JSON parsing and generating code which is used in jsoc_info.
Prototip Popup help box code used in lookdata and exportdata.
The qDecoder library is used to support cgi-bin programs for GET and POST processing.
And hints for compatibility found in w3schools DOM information pages.
The operations now supported at http://jsoc.stanford.edu/cgi-bin/ajax are:
- Implemented via show_series
series_list -- Implemented within "show_series". Expects a single parameter "filter" which is used in regex to limit the number of seriesnames returned. See the man page for show_series. If e.g. all series with the word "mdi" are desired then the URL will be: http://jsoc.stanford.edu/cgi-bin/ajax/show_series?filter=mdi%5C.
- Implemented via jsoc_info
series_struct -- Like "show_info -l -s". For example, the URL for the series mdi.vw_V_lev18 is: http://jsoc.stanford.edu/cgi-bin/ajax/jsoc_info?op=series_struct&ds=mdi.vw_V_lev18 which returns the JSON version of the information in the shell commands "show_info -l mdi.vw_V_lev18" and "show_info -s mdi.vw_V_lev18" combined. This operation generates a javascript object which contains properties that represent the various components of a series: keywords, segments, links. It also contains series-wide information like retention time, unit size, archive flag, tapegroup number, prime-key constituents, and a list of keywords that have database indices.
rs_summary -- Implemented in jsoc_info. Example URL is http://jsoc.stanford.edu/cgi-bin/ajax/jsoc_info?op=rs_summary&ds=mdi.vw_V_lev18 This operation generates a javascript object that contains three properties named count, runtime, and status. The value for count is the number of DRMS records in the series specified with the ds argument. runtime is the time, in seconds, that elapsed during the execution of the underlying jsoc_info command. status is ??? (it is always 0).
- rs_list -- Implemented in jsoc_info. Functions similar to the shell command "show_info" to get dataseries record attributes.
- Implemented via jsoc_fetch
- exp_request -- Primary data export request tool. Expects a recordset specification and desired export method. Exported data will be either delivered immediately or kept online for 7 days depending on calling parameters and online status of the data. If the data is not immediately available a "RequestID" is returned to the user for later access to the exported data.
- exp_repeat -- Repeats an expired request (the export data package is gone after 7 days). Expects RequestID.
- exp_status -- Checks on status of prior request, expects RequestID.
- exp_su -- Special tool for export of SUMS "Storage Units" to remote DRMS sites.
The details are below, but some examples here may help. The lookdata call limits the recordset to 10,000 records. The present implementation can take a minute or more for 10,000 records. Use care. This and following functions expect an explicit recordset query which will resolve to a subset of the records in the given series.
The example URL here returns a few keywords for 5 minutes (5 records) from the mdi.vw_V_lev18 data.
http://jsoc.stanford.edu/cgi-bin/ajax/jsoc_info_test?ds=mdi.vw_V%5B1996.05.01%2F5m%5D&op=rs_list&key=DATAMEAN%2CT_OBS
- Note the escape characters for URL encoding: '[' to '%5B', ']' to '%5D', '/' to '%2F', ',' to '%2C' etc. The returned json text from this query, with white space added, is:
{ "keywords":[
    {"name":"DATAMEAN",
     "values":["321.424972", "320.428355", "320.296396", "319.591784", "320.101514"] },
    {"name":"T_OBS",
     "values":["1996.05.01_00:01:00_TAI", "1996.05.01_00:02:00_TAI", "1996.05.01_00:03:00_TAI", "1996.05.01_00:04:00_TAI", "1996.05.01_00:05:00_TAI"] }
  ],
  "segments":[],
  "links":[],
  "count":5,
  "status":0
}
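The reply above is column oriented: each keyword carries an array with one value per record. A minimal Python sketch of turning that into one row per record follows. The reply text is copied from the example above; the helper name keyword_rows is an assumption made for illustration and is not part of jsoc_info.

```python
import json

# Reply text copied from the rs_list example above.
response_text = """
{ "keywords":[
    {"name":"DATAMEAN",
     "values":["321.424972", "320.428355", "320.296396", "319.591784", "320.101514"] },
    {"name":"T_OBS",
     "values":["1996.05.01_00:01:00_TAI", "1996.05.01_00:02:00_TAI",
               "1996.05.01_00:03:00_TAI", "1996.05.01_00:04:00_TAI",
               "1996.05.01_00:05:00_TAI"] }
  ],
  "segments":[], "links":[], "count":5, "status":0 }
"""


def keyword_rows(reply):
    """Return one dict per record from the per-keyword "values" arrays."""
    names = [kw["name"] for kw in reply["keywords"]]
    columns = [kw["values"] for kw in reply["keywords"]]
    # zip(*columns) walks the arrays in parallel, i.e. record by record.
    return [dict(zip(names, row)) for row in zip(*columns)]


reply = json.loads(response_text)
if reply["status"] != 0:
    # A non-zero status is accompanied by an "error" element.
    raise RuntimeError(reply.get("error", "rs_list query failed"))

rows = keyword_rows(reply)
for row in rows:
    print(row["T_OBS"], row["DATAMEAN"])
```

All values stay strings here, since jsoc_info returns keyword values as json strings.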
Note that the same query can be made to the shell command, 'show_info' which is also available as a cgi-bin call alongside jsoc_info. Show_info returns text while jsoc_info returns json. In this example "wget" is used to get the response as plain text in the file "pt"
wget -O pt 'http://jsoc.stanford.edu/cgi-bin/ajax/show_info?ds=mdi.vw_V_lev18%5B1996.05.01%2F5m%5D&key=DATAMEAN%2CT_OBS'
- with the response in pt:
DATAMEAN T_OBS
321.424972 1996.05.01_00:01:00_TAI
320.428355 1996.05.01_00:02:00_TAI
320.296396 1996.05.01_00:03:00_TAI
319.591784 1996.05.01_00:04:00_TAI
320.101514 1996.05.01_00:05:00_TAI
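The URL escaping used in the examples above does not need to be done by hand. The following Python sketch builds the same show_info query string with the standard library. The helper name cgi_url is an assumption for illustration; the base URL and parameters are those of the wget example.

```python
from urllib.parse import urlencode

# Base of the cgi-bin programs used in the examples above.
BASE = "http://jsoc.stanford.edu/cgi-bin/ajax/"


def cgi_url(program, **params):
    """Build a GET URL for a cgi-bin program, escaping '[', ']', '/', ',' etc."""
    return BASE + program + "?" + urlencode(params)


url = cgi_url(
    "show_info",
    ds="mdi.vw_V_lev18[1996.05.01/5m]",
    key="DATAMEAN,T_OBS",
)
print(url)
```

The same helper can be pointed at jsoc_info by adding op="rs_list" to the parameters.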
The prototype access web tool "http://jsoc.stanford.edu/ajax/lookdata.html" uses show_series and jsoc_info. The prototype export tool "http://jsoc.stanford.edu/ajax/exportdata.html" uses jsoc_fetch. Development versions, sometimes not even 'beta' level and possibly unstable, but possibly useful are lookdata2.html and exportdata2.html.
NOTE: The browser tools lookdata.html and exportdata.html are intended to work with standards compliant browsers. Functionality with non-standards based browsers, e.g. IE, will be nice but accidental.
Implementation Plan
One of the access tools to initiate an export is a set of web cgi-bin modules using GET and POST methods. These allow simple browser access via direct html or javascript, or shell access via wget or curl.
These web accessible modules communicate using the JSON protocols (the functionality of a stripped down XML) and later XML if needed. These are not expected to be directly usable browser tools, they are components to be used by web pages provided separately, e.g. lookdata.html.
JSOC cgi-bin export programs
A set of basic operations to allow identification of series, lists and meaning of keyword metadata in those series, the range of dates present, etc. is now available. Another set of operations needed to allow access to keyword values and direct access to the files containing the bulk of the data is also functional. These are built as operations with an "opcode" first parameter as shown and implemented via several cgi-bin programs.
Usage
The basic idea is that the user (here meaning browser javascript or shell script wget calls) will make a sequence of requests to build up a fully specified JSOC DRMS "record set" query. Then that query will be used to fetch some desired data or metadata. If only metadata is needed, the response will be "immediate". If file data is needed and is online then the request can also be provided immediately. If the online status is not known or is known to be offline then an export request can be submitted, a RequestID is returned to the user, then that RequestID can be used in subsequent polling to determine when the data is actually available and to get the link for data access. In both the immediate and request-respond methods the data will in the end be provided via a URL or FTP address.
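The request-then-poll handshake described above can be sketched in Python as follows. This is only an illustration under stated assumptions: get_status is a caller-supplied function (hypothetical) that performs the op=exp_status call and returns the decoded reply, and the status codes are taken from the exp_request and exp_status descriptions later in this document (0 = ready, 1 and 2 = pending, 6 = RequestID not yet recognized, so repeat).

```python
import time

# Status values that mean "not ready yet, ask again" (from the exp_status list).
PENDING = (1, 2, 6)


def wait_for_export(request_id, get_status, poll_seconds=10.0, max_polls=60,
                    sleep=time.sleep):
    """Poll get_status(request_id) until the export is ready or has failed."""
    for _ in range(max_polls):
        reply = get_status(request_id)
        status = reply["status"]
        if status == 0:
            return reply  # data is available; the reply describes where
        if status not in PENDING:
            raise RuntimeError(
                "export %s failed with status %s: %s"
                % (request_id, status, reply.get("error", "no error message"))
            )
        sleep(poll_seconds)
    raise TimeoutError("export %s still pending after %d polls"
                       % (request_id, max_polls))


# Usage with a stand-in for the server: queued, then processing, then done.
# The RequestID and dir values are placeholders, not real server output.
fake_replies = iter([
    {"status": 2},
    {"status": 1},
    {"status": 0, "dir": "hypothetical/export/dir"},
])
result = wait_for_export("hypothetical_request_id",
                         lambda request_id: next(fake_replies),
                         sleep=lambda seconds: None)
print(result)
```

Passing the fetch function in keeps the polling logic independent of whether the call is made with wget, curl, or an HTTP library.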
At some point this process can be expanded to allow the user to build up an "export cart" like a shopping cart which will contain a compound recordset query. The first implementation described below will only support exports from a single series per request. This is sufficient if the "export cart" is managed on the user side.
An optional userhandle may be sent for some requests. After the request is sent this same string may be sent with op=kill to abort the previously requested processing. The program jsoc_userkill implements this function.
Synopsis
jsoc_{something} op=<command> {other arguments as specified below in Expects list}
Description:
Each of these commands can be executed locally, via a browser presumably from a Javascript program, or via "wget" or similar program.
op=exp_kinds - get list of export methods with rules and limits of use.
Usage: use this call to get list and restrictions for export methods and protocols. Use to inform user of choices that will be needed later. Could be expanded to be part of login handshake with Requestor to establish preferred method and verify limits for that user. Implement when better understood.
Methods might be: ftp (for push), http (immediate), email (delayed), tape (mail), url (for pickup)
Protocols might be: FITS_tar, FITS_zip, jpg, mov, etc.
NOTE: This does not exist yet. Probably obsolete. Do not expect it soon.
op=series_list - get list of series matching specified filter, like show_series.
Usage: use this call to get a list of target series for further examination
Method: GET
Expects:
* a ds parameter containing a series filter.
Returns (as JSON):
* status - returns 0 if OK, 1 if series not found, -1 if the backend process was terminated (typically when the user cancels the export request). If status is 1, returns element "error" containing error message
* n - count of the seriesnames matching the query
* names - an array of series information containing:
  * name - series name
  * primekeys - an array containing list of prime key names
  * note - descriptive text for the series
NOTE: The ds filter is a regular expression to match seriesnames. A prefix of "NOT" will exclude names matching the filter. This function is fully implemented as a call to show_series. From browser use cgi-bin/show_series?ds=<filter> (may omit the ? and arg to ask for all series) wget may be used, e.g.:
wget -O list.json http://jsoc.stanford.edu/cgi-bin/ajax/show_series?ds=hmi
NOTE1: the -v flag of show_series will now provide additional information including retention, archive, unitsize, and owner. To access this include an additional "expect" parameter of "v=1".
op=series_struct - info for "seriesname" gives list of keywords as show_info -l and show_info -s combined
Usage: use this call to get structural contents of a series with summary of data coverage. This info can provide info needed to formulate a request for contents of key values or data arrays.
Method: GET
Expects:
* ds - param with seriesname
* userhandle - param with unique session ID, optional but allows user kill if needed.
Returns (as JSON):
* status - returns 0 if OK, 1 if series not found, -1 if the backend process was terminated (typically when the user cancels the export request). If status is non-zero, returns element "error" containing error message
* runtime - time in seconds for server processing
* note - descriptive text for the series
* archive - 1 means data is archived, 0 means not archived
* retention - number of days data retained in SUMS
* tapegroup - SUMS tapegroup number
* unitsize - max number of records per SUMS storage unit
* owner - JSOC db owner of series
* primekeys - an array containing list of prime key names
* dbindex - an array containing list of DBindex key names
* keywords - an array of keyword info containing:
  * name - keyword name
  * linkinfo - present if and only if the keyword is linked, contains link name and target keyword name
  * type - keyword type, e.g. int, string
  * recscope - keyword scope
  * defval - default value for keyword
  * units - keyword units
  * note - descriptive text for the keyword
* segments - an array of segment info containing:
  * name - segment name
  * type - array data type
  * units - data units
  * protocol - storage protocol for segment file
  * dims - segment array dimensions
  * note - descriptive text for segment
* links - an array of link info containing:
  * name - link name
  * target - series name for target of the link
  * kind - "static" or "dynamic"
  * note - descriptive text for the link
* interval - a struct containing first and last info
  * FirstRecord - contains query that will match the first record based on the first prime key
  * FirstRecnum - contains the recnum of the FirstRecord
  * LastRecord - contains query that matches the final record based on the first prime key.
  * LastRecnum - contains the recnum for LastRecord
  * MaxRecnum - contains the highest recnum in the series.
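As a sketch of how a client might use a series_struct reply, the Python below pulls the keyword names and types out of the "keywords" array. The sample reply is hypothetical and heavily abbreviated: the field names follow the Returns list above, but the values (types, prime key, segment) were made up for illustration and are not real server output.

```python
import json

# Hypothetical, abbreviated series_struct reply. Field names follow the
# Returns list above; the values are invented for illustration only.
sample_text = """
{ "status": 0,
  "primekeys": ["T_OBS"],
  "keywords": [
    {"name": "T_OBS", "type": "time", "units": "TAI", "note": "illustrative"},
    {"name": "DATAMEAN", "type": "float", "units": "m/s", "note": "illustrative"}
  ],
  "segments": [
    {"name": "example_segment", "type": "short", "dims": "1024x1024"}
  ]
}
"""


def keyword_types(struct):
    """Map keyword name to keyword type for a series_struct reply."""
    if struct["status"] != 0:
        # A non-zero status is accompanied by an "error" element.
        raise RuntimeError(struct.get("error", "series_struct failed"))
    return {kw["name"]: kw["type"] for kw in struct["keywords"]}


struct = json.loads(sample_text)
types = keyword_types(struct)
print(types)
print("prime keys:", ", ".join(struct["primekeys"]))
```

A table like this is enough to let a page offer the user a keyword pick list before an rs_list query is formed.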
- NOTE: This operation should be fast, the user can expect a prompt reply from the server. NOTE: this function is now implemented as jsoc_info, example:
- jsoc_info ds=mdi.vw_V_lev18 op=series_struct
or, as a URL: http://jsoc.stanford.edu/cgi-bin/ajax/jsoc_info?op=series_struct&ds=mdi.vw_V_lev18
op=rs_summary - get recordset summary info for "record_query", use to refine query.
Usage: use this call to probe the expected return for a given query. Can be used to estimate appropriateness of the query for the job at hand. With extensions can be used to probe completeness, etc.
NOTE: this call may be slow if the series is large. The user should be patient.
Method: GET
Expects:
* ds - containing simple record_query (i.e. only one series spec).
Returns (as JSON): server gives count of records and some other info, online, size, etc. some coverage statistics based on completeness within recordset. Maybe a bar plot of coverage in some bins. Start with just count of records.
* status - returns 0 if OK, 1 if series not found, -1 if the backend process was terminated (typically when the user cancels the export request). If status is non-zero, returns element "error" containing error message
* runtime - time in seconds for server processing
* count - number of records matching query.
- NOTE: this operation is implemented in jsoc_info taking op=rs_summary and ds=recordset
op=rs_list - get recordset list expanded with selected keyword and segment values. Basically like show_info with key= and seg= and if seg= then -P args.
Usage: use this call to get detailed information from DRMS, record names, keyword values, full paths to online data, etc. Can be final query for some tasks where keyword values are sufficient. Can provide a list of records that can be further sub-selected based on keyword values, etc.
Method: GET
Expects:
* ds - containing recordset query, required
* key - keyword name list, optional, if not present all keys are processed
* seg - segment name list, optional, if not present all segs are processed.
* link - record link name list, optional, if not present NO links are processed.
* userhandle - param with unique session ID, optional but allows user kill if needed.
Returns (as JSON):
* status - 0=OK, 1=query failed, -1 if the backend process was terminated (typically when the user cancels the export request). If status is non-zero, returns element "error" containing error message
* runtime - time in seconds for server processing
* count - number of records returning values
* keywords - array of objects containing
  * name - keyword name
  * values - array of <count> values for that keyword.
* segments - array of objects containing
  * name - name of segment
  * dims - array of <count> strings containing segment file array dimensions
  * values - array of <count> pathnames to segment file
* links - array of objects containing
  * name - name of link
  * values - array of <count> record queries specifying the link target record
* recinfo - array of count objects containing:
  * name - name of record as query, as in the # line of show_info -k
  * online - 1=online 0=offline
NOTES: Now implemented in jsoc_info.
Jsoc_info understands keyword, segment and link with names "**NONE**" in the key and seg parameter lists as flags that no keys or segs are wanted.
Also jsoc_info recognizes "**ALL**" to mean the obvious, for any of key, seg, or link.
Jsoc_info also recognizes the pseudo keyword names "*recnum*", "*sunum*", "*size*", "*online*", "*retain*", and "*logdir*".
If those are specified, it returns the recnum, sunum, storage unit size, online status, retention date, and log directory, respectively.
The "**ALL**" flag does not prevent explicit keywords from being listed too; in that case those keywords will appear twice.
Starting in Oct 2008 all keyword values are returned as json strings. This allows floating NaNs to be returned as "nan" instead of a bunch of "9"s. It also accommodates octal and hex formats.
At present the "*online*" and recinfo.online contain the same information in different forms. *online* returns "Y" or "N" vs recinfo.online which returns "1" or "0". Also, recinfo.online is only valid if a segment was specified while *online* is always valid.
Note that sunum, size, online, and retain refer to the SUMS storage unit associated with the selected record and that a storage unit may contain data from multiple records if the unitsize parameter in the series structure is greater than 1.
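Because all keyword values arrive as json strings, a client has to convert them itself. The Python sketch below shows one way to do that, including the "nan" case and the two online flag conventions described in the notes above. The helper names to_float and is_online are assumptions for illustration.

```python
import math


def to_float(value):
    """Convert a keyword value string to float; return None if not numeric.

    The string "nan" becomes a floating point NaN.
    """
    try:
        return float(value)
    except ValueError:
        return None


def is_online(flag):
    """Accept both conventions: "Y"/"N" from *online*, 1/0 from recinfo.online."""
    return str(flag) in ("Y", "1")


values = ["321.424972", "nan", "1996.05.01_00:01:00_TAI"]
converted = [to_float(v) for v in values]
print(converted)
print(is_online("Y"), is_online("0"))
```

Octal and hex formatted values are not handled by this sketch and would be returned as None.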
op=rs_image - get raw data or thumbnails for recordset for selected segments. This call generates a request in jsoc.exports only if needed to get space to make images. We need to develop a mechanism to avoid lots of duplicate image making.
Usage: can be used along with rs_list to get selection info to better define desired data. Can also be used to get direct URL of online generic data.
Method: POST
Expects: (as JSON)
* ds - recordset spec for single series
* seg - list of segments for which thumbnails are requested. Can be omitted for all.
* protocol - file as is: nop; single images: gif, jpg, png; or movie: mov, etc.
Returns:
* status - 0=OK, 1=failed, -1 if the backend process was terminated (typically when the user cancels the export request). If status is non-zero, returns element "error" containing error message
* count - number of records returning images.
* segments - array of segment names for which images are available
* recinfo - array of record info containing:
  * record - name of record as query
  * <segname> - array of <count> URLs pointing to file or thumbnail images or URL of movie. ...
Note: no development of this option yet.
op=exp_request - request export of recordset with data. The server examines the request:
* if immediately available and method is url_quick then send status and data
* else if small enough and is online but needs export processing then initiate processing and return status and RequestID.
* else estimate processing and/or size info and return status along with options and RequestID.
Note the query can be concatenation of several record sets. I.e. user can make datacart.
Usage: Primary data request tool when data files desired.
Method: POST or GET - user should protect against multiple identical requests.
Expects:
About the data:
* ds - contains recordset query
* process - Requested processing prior to export in desired protocol, default is "n=0,no_op". See below.
* protocol - file conversion request. At present options are: fits, as_is, jpg, mpg, mp4. See Note re compression settings.
* filenamefmt - rule for export filenames, default: {seriesname}.{recnum:%ld}.{segment_filename}
About the communication:
* format - format of returned information, defaults to "json", options will be: json, txt, html, maybe xml
* method - name of export method, e.g.: url, url_quick, and later: ftp, http, email, tape.
About the user:
* requestor - ID of user, can be random or known Requestor.
* notify - email address of user. May be omitted unless method is "email" or "tape".
* file - uploaded file in case ds == "*file*"
About the request process itself:
* userhandle - param with unique session ID, optional but allows user kill if needed.
Returns in the specified format:
* status -
  0=OK immediate data available or queue managed data is complete
  1=request received and action is pending, i.e. in processing
  2=queued for processing
  3=request too large for automatic requests
  4=request not formed correctly, bad series, etc.
  5=request old, results requested after data timed out.
  -1=the backend process was terminated (typically when the user cancels the export request).
  If status is > 2, returns element "error" containing error message
* requestid - RequestID of record in jsoc.exports
* dir - Directory in the JSOC system where exported data is located.
* data - an array of count objects containing:
  * record - name of record as query with segment name as suffix
  * filename - file or link name of the requested data.
* size - bytes of data to be returned if positive, or -1 if not known yet.
* rcount - number of records found in the recordset.
* count - number of files returned in the data array.
* method - copied from input, but url_quick may be reported as "url" if applicable.
* protocol - copied from input.
* wait - estimated seconds until data is available if status==1.
* error - message, only present if status > 2
* contact - email address, name to contact if status > 2. User should contact with RequestID.
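The exp_request status values listed above can be captured in a small enumeration so that client code does not compare against bare numbers. This is a Python sketch only: the member names and the describe helper are assumptions for illustration, while the numeric codes are those in the list above.

```python
from enum import IntEnum


class ExportStatus(IntEnum):
    """Status codes returned by op=exp_request, as listed above."""
    TERMINATED = -1   # backend process was terminated
    OK = 0            # data available
    PROCESSING = 1    # request received, action pending
    QUEUED = 2        # queued for processing
    TOO_LARGE = 3     # too large for automatic requests
    BAD_REQUEST = 4   # request not formed correctly, bad series, etc.
    EXPIRED = 5       # results requested after data timed out


def describe(reply):
    """Return a one-line, human-readable summary of an exp_request reply."""
    status = ExportStatus(reply["status"])
    if status == ExportStatus.TERMINATED:
        return "terminated"
    if status == ExportStatus.OK:
        return "ready"
    if status > ExportStatus.QUEUED:
        # For status > 2 the reply carries an "error" element.
        return f"failed ({status.name}): {reply.get('error', 'no error message')}"
    return f"pending ({status.name}), RequestID {reply.get('requestid')}"


print(describe({"status": 0}))
print(describe({"status": 2, "requestid": "hypothetical_id"}))
print(describe({"status": 3, "error": "request too large"}))
```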
Note: This is now implemented via "jsoc_fetch". The "process" field allows passing one of the approved on-demand processing requests for JSOC data. This field is also the mechanism for passing a record count limit. The record limit is passed as a leading "n=<count>", where n=0 means no limit. The subsequent strings start with a processing name followed by zero or more comma-delimited parameters specific to that type of processing. The default is "n=0,no_op", meaning that no processing will be done beyond protocol conversion if needed. Other processing presently includes a trial version of on-demand extraction of solar disk passages of a selected patch. See NOTE-2 below for details.
Note on method=url_quick: If the protocol is as-is and the full RecordSet is online, data will contain a JSON array of pairs of query names and full URLs, one for each record. This is similar to the op=rs_list format for segments. If the data is not all online, url_quick will be treated as if it were "url". If the "url_quick" request was successful, there will be no record of the export in jsoc.exports, RequestID will be empty, and Notify and Requestor will have been ignored.
For normal exports, i.e. method=url, the data may be accessed by creating the URL from "http://jsoc.stanford.edu/" + the contents of the "dir" variable + "/" + the contents of a "filename" variable.
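The URL rule for method=url exports can be sketched in Python as below. The concatenation follows the rule stated above literally. The sample reply is hypothetical (the dir and filename values are placeholders), and whether "dir" begins with a slash is not stated in this document, so a real client may need to adjust for that.

```python
HOST = "http://jsoc.stanford.edu/"


def data_urls(reply, host=HOST):
    """Build one URL per entry in the "data" array: host + dir + "/" + filename."""
    return [host + reply["dir"] + "/" + item["filename"]
            for item in reply["data"]]


# Hypothetical reply; the dir and filename values are placeholders.
sample_reply = {
    "status": 0,
    "dir": "hypothetical/export/dir",
    "data": [
        {"record": "hypothetical record query 1", "filename": "file1.fits"},
        {"record": "hypothetical record query 2", "filename": "file2.fits"},
    ],
}

urls = data_urls(sample_reply)
for url in urls:
    print(url)
```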
Note: For FITS export protocols, compression parameters to be passed to the cfitsio library may be specified as a suffix to the "fits" protocol word, as a comma separated list of strings. If there is no "," after "fits", the exported data will have the same compression as the DRMS files. One string may be present for each segment name being exported, in the order that the names are encountered in the jsd. If more segments are requested than compression strings are given, the last one will be reused as many times as necessary. If an uncompressed file is desired it is best to include the opt-out string consisting of "**NONE**".
NOTE: As of 2014, compression parameters are ignored; the internal compression parameters will be used for export.
Option for providing the desired RecordSet via an uploaded file is now implemented. For this case the "ds" param must be "*file*" and the POST method must be used. In this case the assumption is that a possibly hidden iframe is used and the returned json script will be in the implied <body> tag so content type will be text/html rather than json. The uploaded file should be provided with the field name "file" and should consist of lines of RecordSet specs.
Note: presently testing file upload in jsoc_fetch and exportdata.html. Present limit of 8192 chars in the uploaded file.
NOTE-2: On demand processing. There are now 2 options allowed which are implemented in exportdata.html. In all cases the parameters of an on-demand processing program are passed in the "processing" field. The parameters are passed in a comma delimited string. After the limit count (n=<count>) the first field of this parameter string must be the name of the program to do the requested processing. Subsequent fields are param=value pairs containing any needed command-line arguments for the processing program, other than those available in explicit fields, e.g. 'ds' above. The ds=RecordSet spec must result in a non-zero length recordset which is small enough to export, even though the processed data may be smaller. The extra processing is handled by jsoc_export_manage generating a script which first runs the desired program, which should generate data in a normal dataseries. It should also write a line in a logfile for each record created while satisfying the desired export. Next that file containing a new recordset is passed to jsoc_export_as-is or jsoc_export_as_fits as appropriate from the 'protocol' parameter above, and the export is made. See the program hg_patch man page for details of parameters for the extracted regions.
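The layout of the processing string described above (limit count first, then program name, then param=value pairs) can be sketched in Python as follows. The helper names build_process and parse_process are assumptions for illustration, and the parameter name used in the example is a placeholder, not a documented hg_patch argument.

```python
def build_process(program="no_op", limit=0, **params):
    """Assemble "n=<count>,<program>[,param=value...]" as described above."""
    fields = [f"n={limit}", program]
    fields += [f"{name}={value}" for name, value in params.items()]
    return ",".join(fields)


def parse_process(text):
    """Split a processing string back into (limit, program, params)."""
    fields = text.split(",")
    if len(fields) < 2 or not fields[0].startswith("n="):
        raise ValueError("expected a leading n=<count> and a program name")
    limit = int(fields[0][2:])
    program = fields[1]
    params = dict(field.split("=", 1) for field in fields[2:])
    return limit, program, params


default = build_process()
# "example_param" is a hypothetical parameter name used only for illustration.
custom = build_process("hg_patch", limit=100, example_param="value")
print(default)
print(custom)
print(parse_process(custom))
```

Values that themselves contain commas would need a quoting rule that this document does not define, so the sketch does not attempt to handle them.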
op=exp_status - request status of pending request.
Usage: part of handshake if exp_request returned status==1.
Method: GET
Expects:
* requestid - RequestID of pending request
* format - format of returned information, defaults to "json", options will be: json, txt, html, xml
Returns in the specified format:
* status - 0=OK immediate data available or delayed request in queue, 1=processing, 2=queued for processing, 2=large request needs manual confirm, 3=bad recordset, 4=request not formed correctly, bad series, etc., 5=request old, results requested after data timed out, 6=RequestID not recognized, probably need to repeat in a few seconds, -1=the backend process was terminated (typically when the user cancels the export request). If status is 3 or above, returns element "error" containing error message
* requestid - RequestID of record in jsoc.exports
* data - an array of count objects containing:
  * record - name of record as query with segment name as suffix
  * filename - file or link name of the requested data.
* dir - URL tail of directory containing the returned data.
* size - bytes of data to be returned if positive, or -1 if not known yet.
* count - number of files returned in the data array.
* method - copied from input, but url_quick may be reported as "url" if applicable.
* protocol - copied from input.
* wait - estimated seconds until data is available if status==1.
* contact - email address, name to contact if status==2. User should contact with RequestID.
* DATA section (present only if the export request was initiated by an exp_su cmd - see below) - contains 5 columns: 1. storage-unit number, 2. owning series name, 3. path to storage unit, if known, 4. storage-unit status (Y - online, N - offline but archived, X - neither online nor archived, and therefore not retrievable, I - invalid storage-unit number).
Note: this is now implemented in "jsoc_fetch".
Note on dir URL contents: The provided URL will be a directory containing some files with standard names and an "index.html" that will provide information about the exported data in web form. The directory will contain a "packing-list" file which will be a table with a row for each data file. The row will contain the DRMS record query that resolves to the record and a filename which can be concatenated onto the string "http://jsoc.stanford.edu/" and the "dir" string to be a URL for the file. If the data is "as-is" the files will be links to the actual segment files for the selected records. If some processing has been done, the data will be in files possibly tarred together and the packing-list will be a catalog of the tar file.
The files index.html, index.json, index.txt (later maybe index.xml) will be present, all containing the same information. The index.json file contents will be the same as the returned json text if format=json. Additional files will be present in the form {RequestID}.{extension} where the present extensions are "qsub", "drmsrun", and "env", which contain the qsub script and drms_run scripts run to make the export, and the shell environment during the drms_run session.
If status==4 is returned, the export failed, probably due to a processing error. Ask the JSOC help staff to examine ~/jsoc/exports/tmp/<RequestID>.runlog for the processing error messages.
op=exp_repeat - This call initiates a re-export of a prior export which has expired. Only the RequestID need be provided, and optionally new requestor contact info for email notification.
Usage: Used to renew an expired export, results in repeat of any needed calculations or processing.
Method: GET or POST
Expects:
* requestid - RequestID of record in jsoc.exports from prior successful export request.
* notify - email address of user. May be omitted.
* userhandle - param with unique session ID, optional but allows user kill if needed.
Returns: Same as for exp_status.
Note: If RequestID corresponds to a current online request, the retention will be extended and no further action will be taken, return will be as if exp_status was requested. Otherwise, an export record (in jsoc.export) with a status==0 will be updated in the corresponding record in jsoc.export_new with a status=2 and a possibly new notify address. The re-exported product will have the same RequestID but a new sunum.
Implemented in jsoc_fetch_test and exportdata2.html.
op=exp_su - This call initiates export of a StorageUnit to a remote DRMS. If data are online and available at the time the call is made, then a status of 0 is returned, and a list of per-SU data are returned. If the data are not online and archived, this call will initiate an asynchronous retrieval of data from SUMS. In that case, polling for retrieval completion can be achieved with exp_status calls.
Usage: Used by remote DRMS to get needed SU. Complete with exp_status calls.
Method: GET or POST
Expects:
* requestor (optional, used only in the case a retrieval request is made) - name of remote DRMS site.
* method - name of export method, e.g. ftp, tape, url, url_quick
* sunum - comma-separated list of storage unit numbers, if present takes precedence over the ds param
* ds - comma-separated list of storage unit numbers, may use if sunum param is absent
* protocol - specifies the type of files that the exported storage unit should contain (as-is or FITS)
* format - the format of the response (txt, json, html, xml)
* formatvar - variant of the format, for format=json, formatvar=dataobj will cause the property named "data" to be an object containing storage-unit objects instead of an array of storage-unit objects.
Returns:
* status - 0=OK all data are online and available at the time of the exp_su call, 2=an asynchronous data retrieval request was made because not all data were online, -1=the backend process was terminated (typically when the user cancels the export request).
* requestid - RequestID of record in jsoc.exports
* protocol
* size - bytes of data to be returned if positive, or -1 if not known yet.
* wait - estimated seconds until data is available if status==2.
* DATA section - contains 5 columns: 1. storage-unit number, 2. owning series name, 3. path to storage unit, if known, 4. storage-unit status (Y - online, N - offline but archived, X - neither online nor archived, and therefore not retrievable, I - invalid storage-unit number).
Note: This is implemented as an option in jsoc_fetch, and the return status can be examined with op=exp_status. The return values are therefore made available using the same methods as for op=exp_request, which is why the names differ somewhat from those in earlier versions of this document. In op=exp_su mode, jsoc_fetch accepts both HTTP GET and HTTP POST requests.
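As an illustration of the exp_su call described above, the following is a minimal Python sketch that builds the request URL and interprets a JSON reply. It assumes that jsoc_fetch lives in the cgi-bin/ajax directory named in the Client Side section. The helper names (exp_su_url, storage_units, needs_polling, describe_su_status) and the default method and protocol values are hypothetical choices for this example, not part of the interface.

```python
import urllib.parse

# Assumed location of jsoc_fetch, based on the cgi-bin directory cited in this document.
JSOC_FETCH = "http://jsoc.stanford.edu/cgi-bin/ajax/jsoc_fetch"

# Storage-unit status codes as listed for the DATA section.
SU_STATUS = {
    "Y": "online",
    "N": "offline but archived",
    "X": "neither online nor archived, not retrievable",
    "I": "invalid storage-unit number",
}


def exp_su_url(sunums, method="url_quick", protocol="as-is",
               requestor=None, dataobj=False):
    """Build a GET URL for op=exp_su from a list of storage-unit numbers."""
    params = {
        "op": "exp_su",
        "sunum": ",".join(str(n) for n in sunums),
        "method": method,
        "protocol": protocol,
        "format": "json",
    }
    if dataobj:
        params["formatvar"] = "dataobj"
    if requestor:
        params["requestor"] = requestor
    return JSOC_FETCH + "?" + urllib.parse.urlencode(params)


def storage_units(reply):
    """Return the storage-unit objects as a list, whether "data" is an
    array (default) or an object (formatvar=dataobj)."""
    data = reply.get("data", [])
    return list(data.values()) if isinstance(data, dict) else list(data)


def needs_polling(reply):
    """True when status==2, i.e. an asynchronous retrieval was started
    and exp_status calls are needed to follow it."""
    return int(reply["status"]) == 2


def describe_su_status(code):
    """Translate a one-letter storage-unit status into words."""
    return SU_STATUS.get(code, "unknown status code")
```

A remote site would fetch the URL with any HTTP client, parse the body as JSON, and call needs_polling on the result to decide whether to follow up with exp_status.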
op=exp_history - This call gives a remote requestor a log of prior requests.
Usage: Used by remote users to manage their requests.
Expects:
* requestor - Requestor ID of previous originator of data export requests.
* activeonly - Boolean; if present, the response includes only requests that have not yet had a status=0 returned from an exp_request, exp_status, or exp_su call.
* requestid - if present, info for only this requestid will be returned, provided the requestor matches.
Returns:
* status - 0=OK, 1=requestor unknown, -1=the backend process was terminated (typically when the user cancels the export request). If status is 1, the element "error" is returned, containing an error message.
* count - number of returned RequestIDs
* requests - array of request descriptions containing
  * requestid - RequestID
  * ds - recordset query used
  * exptime - Time of export request
  * FilenameFmt (string) File basename format
  * Format (int) Data format code
  * Notify (string) Notification address
  * ReqTime (time) Time of request
  * Size (int) Volume of data requested
  * Status (int) Status of request
NOTE: we need to establish some password protection for this request. The names and email addresses of requestors along with the details of their export requests will not be public information. It will be maintained for statistical purposes and to allow notification to the requestors if the data they have exported is found to be faulty, poorly calibrated, etc.
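A client could check an exp_history reply along the following lines. This is only a sketch in Python, assuming the reply has already been parsed from JSON into a dictionary with the status, error, count, and requests elements listed above. The function names and the choice of exception types are hypothetical.

```python
def history_requests(reply):
    """Return the list of request descriptions from an exp_history reply,
    raising an exception when the reply reports a failure."""
    status = int(reply["status"])
    if status == 1:
        # Requestor unknown: the reply carries an "error" element.
        raise LookupError(reply.get("error", "requestor unknown"))
    if status != 0:
        raise RuntimeError(f"exp_history failed with status {status}")
    requests = reply.get("requests", [])
    # "count" should equal the number of request descriptions returned.
    if "count" in reply and int(reply["count"]) != len(requests):
        raise ValueError("count does not match the number of requests")
    return requests


def request_ids(reply):
    """Collect the RequestIDs found in an exp_history reply."""
    return [r["requestid"] for r in history_requests(reply)]
```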
op=kill - This call attempts to abort a prior call made with the same userhandle.
Expects:
* userhandle - param with unique session ID, optional but allows user kill if needed.
Returns:
* status - 0=OK, 1=Failed, -1=the backend process was terminated (typically when the user cancels the export request).
Note: A userhandle can be created by appending the current time to a string returned by the cgi-bin program jsoc_WebRequestID, which yields a unique handle for each call. See, e.g., the setting of RequestHandle in lookdata.html. The program jsoc_userkill implements the kill operation. When a call is made to jsoc_info or jsoc_fetch with a userhandle parameter, that userhandle is saved on the JSOC side until the request is complete. If a call to jsoc_userkill is made with a matching userhandle while the request is being processed, the program servicing that request is terminated.
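The userhandle mechanism can be sketched in Python as follows. Several details are assumptions made only for illustration: that jsoc_userkill sits in the same cgi-bin/ajax directory as jsoc_fetch, that it takes the handle in a parameter named userhandle, and that the time is appended as integer seconds. The helper names are hypothetical.

```python
import time
import urllib.parse

# Assumed directory for the cgi-bin programs; only jsoc_info, show_series,
# and jsoc_fetch are stated in this document to live here.
AJAX_BASE = "http://jsoc.stanford.edu/cgi-bin/ajax/"


def make_userhandle(web_request_id, now=None):
    """Append the current time (integer seconds, an assumed format) to the
    string returned by jsoc_WebRequestID."""
    stamp = int(time.time() if now is None else now)
    return f"{web_request_id.strip()}{stamp}"


def with_userhandle(params, userhandle):
    """Return a copy of the request parameters with the userhandle added,
    so the request can later be aborted."""
    return {**params, "userhandle": userhandle}


def kill_url(userhandle):
    """Build a URL asking jsoc_userkill to abort the call that was made
    with this userhandle."""
    query = urllib.parse.urlencode({"userhandle": userhandle})
    return AJAX_BASE + "jsoc_userkill?" + query
```

The same handle is attached to the jsoc_info or jsoc_fetch call and kept by the client, so that kill_url can be fetched if the user cancels before the call returns.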
Client Side
A sample tool is now at http://jsoc.stanford.edu/ajax/lookdata.html that functions with show_series, jsoc_info and jsoc_fetch (all in http://jsoc.stanford.edu/cgi-bin/ajax/). lookdata supports calls to all of the above functions that are noted to have show_series, jsoc_info, and jsoc_fetch implementations.
Lookdata may be used as a working example whose parts can serve as building blocks for a more capable and friendlier user interface.
Each of the implemented jsoc_info, show_series, and jsoc_fetch operations has also been verified to work via wget calls. A useful application built using wget will need a JSON parser compatible with the scripting language chosen. The www.json.org page links to a number of available implementations. A jsoc_fetch script should make one exp_request call and then loop on exp_status calls until the returned status is "0". The reply containing status=0 also provides the JSON with the full result. At some point, a plain-text option will be available for easier shell scripting.
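The request-then-poll pattern described above might look like the following Python sketch, which uses the standard library JSON parser in place of wget. The jsoc_fetch URL is assumed from the cgi-bin directory cited earlier. The names export_and_wait, max_polls, and default_wait are hypothetical, and treating a negative status as a terminated request follows the -1 code listed for the other operations. The caller supplies whatever parameters exp_request expects.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed location of jsoc_fetch, based on the cgi-bin directory cited above.
JSOC_FETCH = "http://jsoc.stanford.edu/cgi-bin/ajax/jsoc_fetch"


def http_get_json(params):
    """Send one GET request to jsoc_fetch and parse the JSON reply."""
    url = JSOC_FETCH + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def export_and_wait(request_params, fetch=http_get_json, sleep=time.sleep,
                    max_polls=60, default_wait=10):
    """Make one exp_request call, then loop on exp_status until status is 0.

    Returns the reply that carried status=0, which holds the full result.
    """
    reply = fetch({**request_params, "op": "exp_request", "format": "json"})
    requestid = reply.get("requestid")
    polls = 0
    while True:
        status = int(reply["status"])
        if status == 0:
            return reply
        if status < 0:
            raise RuntimeError(f"export terminated, status={status}")
        if requestid is None:
            raise RuntimeError("no requestid returned, cannot poll")
        if polls >= max_polls:
            raise TimeoutError(f"request {requestid} not ready after {polls} polls")
        # Use the server's estimate when given, otherwise a fixed pause.
        sleep(max(0.0, float(reply.get("wait", default_wait))))
        reply = fetch({"op": "exp_status", "requestid": requestid,
                       "format": "json"})
        polls += 1
```

Passing fetch and sleep as arguments keeps the loop independent of the network, so it can be exercised with canned replies before being pointed at the real server.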