Differences between revisions 13 and 32 (spanning 19 versions)
Revision 13 as of 2015-09-30 04:13:40
Size: 13483
Editor: ArtAmezcua
Comment:
Revision 32 as of 2020-01-17 06:03:19
Size: 15599
Editor: ArtAmezcua
Comment:
Line 1: Line 1:
= How-To Subscribe to Data Series Created and Maintained by the JSOC-SDP = = How To Subscribe to Data Series Created and Maintained by the JSOC-SDP =
Line 3: Line 3:
A NetDRMS site wishing to mirror (obtain and maintain a copy of) a DRMS data series from another site may subscribe to that series. Such series must first be published by the serving site. (see Series Publication Notes) To subscribe to a data series, a privileged operator runs the subscribe_series client script to request a subscription. This script, which can be run as many times as desired, allows the operator to subscribe to one or more series at a time. The first time it is run, the server creates a directory for the site into which site-specific slony logs will reside. These logs will contain statements germane only to the set of series to which the site is subscribed. == Background ==
Line 5: Line 5:
In brief, the server satisfies a client request by sending to the client a file containing a SQL dump of the main series table to which the client is subscribing. The file will also contain other set-up commands depending on the status of the client as a new subscriber or not. For example, if a client requests a subscription to hmi.M_45s, then the dump file will contain commands to create the hmi namespace (if it does not already exist), the hmi.m_45s table (it must not exist at subscription time), the appropriate entries in the hmi.drms_keyword, hmi.drms_segment, hmi.drms_link, and hmi.drms_series tables, and many insert commands. The insert commands will populate the hmi.m_45s table with a copy of all the records from the original corresponding table in the server database at the time of subscription. Solar data at the JSOC are stored in a system that comprises two PostgreSQL databases, multiple file systems, a tape back-up system, and software to manage these components. Related sets of data are grouped into data series. Conceptually, each data series is a table in which each row is typically associated with an observation time or a Carrington rotation. The columns contain metadata, such as the observation time, the ID of the camera used to acquire the data, the image rotation, etc. One column in this table contains an ID that refers to a set of data files, typically a set of FITS files that contain images.
Line 7: Line 7:
After the initial subscription completes (subscribe_series has completed successfully), the client has a static snapshot of the series at the time the command was issued. Changes to the original series on the server will not automatically be applied to the client's database. However, the server will regularly generate site-specific slony logs for the client. These logs contain insert (and delete) statements that, when applied by the client, will update the subscribed series tables so that, in effect, the client tables are kept synchronized with the corresponding server tables. The client must regularly download these log files and apply them to remain synchronized with the server. The Data Record Management System (DRMS) is the subsystem that contains and manages the "DRMS" database of metadata and data-file-locator information. One component is a software library, written in C, that provides client programs, also known as "DRMS modules", with an Application Programming Interface (API) that allows the users to access these data. The Storage Unit Management System (SUMS) is the subsystem that contains and manages the "SUMS" database and associated storage hardware. The database contains information needed to locate data files that reside on hardware. The entire system as a whole is typically referred to as DRMS. The user interfaces with the DRMS subsystem only, and the DRMS subsystem interfaces with SUMS - the user does not interact with SUMS directly. The JSOC provides a version of DRMS, NetDRMS, that can be deployed to non-JSOC institutions so that those sites can take advantage of the JSOC-developed software to manage large amounts of solar data.
Line 9: Line 9:
It is also possible to discontinue one or more subscriptions with the subscribe_series script. Site-specific logs generated after the completion of subscribe_series in this mode will no longer contain any statements relevant to the unsubscribed series. The un-subscription process does not remove from the client database the data for the series being removed. The series remain intact and function normally in every way; but they will become static snapshots of the server series at the time of un-subscription. Subsequent changes to the corresponding server will not propagate to the client. A NetDRMS site is an institution with a local NetDRMS installation. It does not generate the JSOC-owned production data series (e.g., hmi.M_720s, aia.lev1) that Stanford generates for scientific use. A NetDRMS site can generate its own data, production or otherwise. That site can create software that uses NetDRMS to generate its own data series. But it can also act as a "mirror" for individual data series. When acting as a mirror for a Stanford data series, the site downloads from Stanford DRMS database information and stores it in its own NetDRMS database, and it downloads SUMS files, and stores them in its own SUMS subsystem. As the data files are downloaded to the local SUMS, the SUMS database is updated with the information needed to manage the data files. It is possible for a NetDRMS site to mirror the DRMS data of any other NetDRMS site, but at this point, the only site whose data are currently mirrored is the Stanford JSOC.
Line 11: Line 11:
Before a client can run subscribe_series, several steps must be taken. First, the client must set up an ssh-agent session so that noninteractive ssh-based communications between the client and server can be established. To do this, follow the directions in [[SSHKeyNotes]], ensuring that you do create a passphrase. Second, the client must edit a configuration file that contains server and client information, such as the server's host machine and port, the server account that ssh uses, the path to the client's scp command, client directories to receive the slony logs, etc. We provide a documented template configuration file at base/drms/replication/etc/repclient.template.cfg (see the section entitled '''Configuration File''' below for more information). This configuration file is used for all client-side programs/scripts, including subscribe_series. Finally, the client must create a “list” file that lists series and the action to take on these series (e.g. subscribe or unsubscribe). We provide another documented template configuration file at base/drms/replication/etc/subscribe_list.template.cfg. In short, the list file contains two white-space separated columns. The first contains series names, and the second contains the text subscribe or unsubscribe. subscribe will cause the client to subscribe to the named series; unsubscribe will remove the named series from the set of subscribed series. == Mirroring Data ==
Line 13: Line 13:
After completing the set-up, the client operator runs subscribe_series by providing three arguments: the path to the configuration file ('''Configuration File'''), the path to the list file, and the path to the ssh-agent source file. A flag -r may also be applied in the event there was a recoverable error from a previous run of subscribe_series. If the -r flag is not provided, then all state files will be wiped clean before subscribe_series is run. If it is provided, then subscribe_series will attempt to continue from where it last left off. (A subscription process might be interrupted by a time-out or outage on the server, a power failure, or a client-side problem that has since been corrected.) The subscription process will complete with a message denoting success or failure. Please see the section entitled '''Troubleshooting''' for information about troubleshooting subscription failures. In order for a NetDRMS site to mirror a data series generated at the JSOC, the data series must first be "published" at the JSOC. NetDRMS sites can mirror only data series that have been published at the JSOC. Then the mirroring site must "subscribe" to the data series. The subscription process is responsible for downloading the DRMS database information to the mirroring site. At the time of subscription, the existing DRMS database information is downloaded to the NetDRMS site and it is then ingested into the NetDRMS DRMS database. As changes are made to the data series at the JSOC, those changes are propagated to the mirroring site and ingested into the local DRMS database. JSOC developers implemented this publication/subscription feature by using Slony-I, PostgreSQL data-replication software (see Series Publication Notes), along with supporting scripts written at Stanford. Slony-I servers that run at the JSOC intercept changes to published data series and generate text files containing the SQL commands that update the DRMS database information at the mirroring site. 
The supporting scripts are responsible for dumping the database information at the time of subscription, and transferring it to, and ingesting it into, the NetDRMS site. The scripts are also responsible for downloading and ingesting the "update files" to the data series. The mirrored data series can be updated every couple of minutes so that the data series at the JSOC and the ones at the mirroring site are synchronized in near real time.

The update files generated by Slony-I contain SQL. For example, if the JSOC creates a new record of data for a published data series, which entails inserting a row in a DRMS database table, an update file will contain an SQL INSERT statement that, if executed, would duplicate the insertion performed at the JSOC. NetDRMS sites that have a subscription to this data series receive the update file containing the SQL INSERT statement and ingest it: they run psql, the PostgreSQL front end, which blindly runs all SQL commands in the update file. As a comment on security, these update text files can contain SQL UPDATE and DELETE statements as well. A Perl script running at the mirroring site looks for these update files at the JSOC, downloads them, and runs psql to execute the SQL commands contained within the files.
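The ingestion step described above can be sketched in Python. This is only an illustration of the idea, not the Perl script that NetDRMS ships: the function names, the `*.sql` file pattern, and the name-order processing are assumptions made for the example. The psql options used (`-h`, `-p`, `-U`, `-d`, `-f`, and `-v ON_ERROR_STOP=1`) are standard psql options.

```python
import subprocess
from pathlib import Path


def build_psql_command(update_file, host, port, user, dbname):
    """Return the argument list that runs one update file through psql."""
    return [
        "psql",
        "-h", host,
        "-p", str(port),
        "-U", user,
        "-d", dbname,
        "-v", "ON_ERROR_STOP=1",  # make psql stop at the first failing statement
        "-f", str(update_file),
    ]


def ingest_update_files(log_dir, host, port, user, dbname):
    """Apply every update file in log_dir, in file-name order (an assumption).

    Stops at the first file that psql fails to apply, so that later files
    are not ingested on top of a partially applied one.
    """
    for update_file in sorted(Path(log_dir).glob("*.sql")):
        command = build_psql_command(update_file, host, port, user, dbname)
        subprocess.run(command, check=True)
```

Because psql executes whatever the file contains, a sketch like this should only ever be pointed at files obtained from a trusted subscription server.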

The update files keep the mirroring NetDRMS site's DRMS database data series information synchronized with the DRMS database information at the JSOC. But they do not synchronize the SUMS data files. A separate system is used to download the data files for newly inserted DRMS records - Remote SUMS (RSUMS). RSUMS is a client-server architecture that operates in two modes: 1. RSUMS can be used to pre-fetch SUMS data files. In this mode, as the mirror NetDRMS ingests the DRMS database information for new data series rows, a database trigger inserts a row in the database table drms.ingested_sunums. The RSUMS client daemon then "sees" a new row in this table and begins the RSUMS download process; 2. RSUMS can be used to fetch SUMS data files as needed, in an on-demand fashion. In this mode, RSUMS operation is triggered only if the mirroring NetDRMS site attempts to use DRMS to access the data file. DRMS will observe that the needed data file is not present, so it will in response initiate the RSUMS download process.

== Subscribing to a Data Series ==

To subscribe to a data series, a privileged operator at the mirroring NetDRMS site runs the subscribe.py client script. This script, which can be run as many times as desired, allows the operator to subscribe to one or more series at a time. The first time it is run, the subscription server (running at the site running Slony-I) creates a directory for the site into which site-specific Slony-I update files will reside. These files will contain statements germane only to the set of series to which the site is subscribed.

In brief, the subscription server satisfies a client subscription request by sending to the client a file containing a SQL dump of the database information for the data series to which the client is subscribing. The file will also contain other set-up commands depending on the status of the client as a first-time subscriber or not. For example, if a client requests a subscription to hmi.M_45s, then the dump file will contain commands to create the hmi namespace (if it does not already exist), the hmi.m_45s table (it must not exist at subscription time), the appropriate entries in the hmi.drms_keyword, hmi.drms_segment, hmi.drms_link, and hmi.drms_series tables, and many insert commands. The insert commands will populate the hmi.m_45s table with a copy of all the records from the original corresponding table in the server database at the time of subscription.
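As a small illustration of the naming in the example above, the following Python sketch maps a series name such as hmi.M_45s to its namespace (hmi) and its table name (hmi.m_45s). The function name is hypothetical; the lower-casing reflects the fact that PostgreSQL folds unquoted identifiers to lower case, which matches the hmi.m_45s table name used in the text.

```python
def split_series_name(series):
    """Return (namespace, qualified_table) for a DRMS series name.

    The series name is expected to have the form '<namespace>.<table>'.
    Both parts are lower-cased, as PostgreSQL does for unquoted identifiers.
    """
    namespace, separator, table = series.partition(".")
    if not separator or not namespace or not table:
        raise ValueError(f"expected '<namespace>.<table>', got {series!r}")
    namespace = namespace.lower()
    return namespace, f"{namespace}.{table.lower()}"
```

For hmi.M_45s this yields the namespace hmi and the table hmi.m_45s, the two objects that the dump file creates on the client.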

After the initial subscription completes (subscribe.py has completed successfully), the client has a static snapshot of the series at the time the command was issued. Changes to the original series on the server will not automatically be applied to the client's database. However, the server will regularly generate site-specific Slony-I update files for the client. These update files contain INSERT (and UPDATE and DELETE) statements that, when applied by the client, will update the subscribed-series tables so that, in effect, the client tables are kept synchronized with the corresponding server tables. The client must regularly download these log files and apply them to remain synchronized with the server.

It is also possible to discontinue one or more subscriptions with the subscribe.py script. Site-specific update files generated after the completion of subscribe.py in this mode will no longer contain any statements relevant to the data series dropped from subscription. The un-subscription process does not remove from the client DRMS database the data for the data series being removed. The data series remain intact and function normally in every way; but they will become static snapshots of the server series at the time of un-subscription. Subsequent changes to the corresponding server data series will not propagate to the client.

Before a client can run subscribe.py, several steps must be taken. First, the client must set up an ssh-agent session so that noninteractive ssh-based communications between the client and server can be established. To do this, follow the directions in [[SSHKeyNotes]], ensuring that you do create a passphrase. Second, the client must edit a configuration file that contains server and client information, such as the server's host machine and port, the server account that ssh uses, the path to the client's scp command, client directories to receive the slony logs, etc. We provide a documented template configuration file at base/drms/replication/etc/repclient.template.cfg (see the section entitled '''Configuration File''' below for more information). This configuration file is used for all client-side programs/scripts, including subscribe.py.

After completing the set-up, the client operator runs subscribe.py by providing three arguments: the path to the configuration file ('''Configuration File'''), the request type ('subscribe', 'resubscribe', or 'unsubscribe'), and the data series (e.g., hmi.V_45s). subscribe.py makes several CGI calls to the JSOC: before initiating the subscription process for a data series, it checks to see if the data series is in fact published at the JSOC. To do so, subscribe.py opens the AJAX publist service URL at the JSOC, asking if the data series is published. subscribe.py also uses the same service to determine if the NetDRMS site has an active subscription for the data series. If a subscription already exists, then subscribe.py skips the subscription, notifying the user. subscribe.py invokes a second CGI to initiate the subscription process: the subscription service. It first checks to see if the NetDRMS has already started the subscription process for the data series, and if not, it then makes a subscription request. The request entails HTTP interaction between the subscription server at the JSOC, and subscribe.py at the NetDRMS mirroring site. SQL dump files are iteratively created at the JSOC and transferred to the NetDRMS site via HTTP download. At the conclusion of the process, subscribe.py terminates and provides a message indicating success or failure.
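The checks that precede a subscription request can be summarized in a short Python sketch. It is not taken from subscribe.py; the function name and the returned strings are hypothetical, and the two boolean inputs stand in for the answers that the publist service would give.

```python
def plan_subscribe(is_published, is_subscribed):
    """Decide what to do with a 'subscribe' request for one data series.

    is_published  -- True if the series is published at the serving site
    is_subscribed -- True if this site already has an active subscription
    """
    if not is_published:
        # Only published series can be mirrored.
        return "reject"
    if is_subscribed:
        # An existing subscription is skipped, and the user is notified.
        return "skip"
    # Otherwise the subscription request is sent to the subscription service.
    return "request"
```

Only the "request" outcome leads to the creation and download of SQL dump files.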
Line 17: Line 35:
 * {{{node}}} - the name of the subscription client (e.g., {{{jsoc}}}, {{{nsocu}}}, {{{sdac}}}). Must be globally unique across all NetDRMS sites; this string will be used in various state files and in file/directory names.
 * {{{kRSServer}}} - the full domain name of the subscription log server (e.g., {{{jsocport.stanford.edu}}}).
 * {{{kRSUser}}} - the account on {{{kRSServer}}} that will be used for data transfer (e.g., {{{jsocexp}}}).
 * {{{kRSPort}}} - the port on {{{kRSServer}}} that will be used for data transfer (e.g., {{{22}}} for {{{scp}}}).
 * {{{pg_host}}} - the client machine that hosts the client PostgreSQL database that will contain the replicated data series - this is <PostgreSQL host>.
 * {{{pg_port}}} - the port on the {{{pg_host}}} machine that will be used for communication with the data-series database.
 * {{{pg_user}}} - the PostgreSQL user that will own the replicated series (e.g., <NetDRMS production user>)
 * {{{pg_dbname}}} - the name of the PostgreSQL database that resides on {{{pg_host}}} (e.g., {{{netdrms}}})
 * {{{slony_cluster}}} - the name of the Slony cluster to which this node belongs (e.g., {{{jsoc}}} for a client subscribing to series published at the JSOC)
 * {{{kLocalLogDir}}} - the client directory that will contain the subscription-process logs.
 * {{{kSQLIngestionProgram}}} - the path to the script/program that will ingest the site-specific slony logs - usually the path to get_slony_logs.pl (e.g., {{{<NetDRMS root>/base/drms/replication/get_slony_logs.pl}}})
 * {{{kDeleteSeriesProgram}}} - the path to the program {{{delete_series}}}, which is used to delete DRMS data series on the client when requested.
 * {{{ingestion_path}}} - the local directory that will contain the ingestion "die" file — used by get_slony_logs.pl
 * {{{scp_cmd}}} - the absolute path to the client's scp program.
 * {{{ssh_cmd}}} - the absolute path to the client's ssh program.
 * {{{rmt_slony_dir}}} - the absolute path, accessible from the {{{kRSUser}}} account, on the server to the directory that contains the site-specific slony logs (/data/pgsql/slon_logs/live/site_logs).
 * {{{slony_logs}}} - the client directory that contains the downloaded site-specific slony logs.
 * {{{PSQL}}} - the path to the client's {{{psql}}} program, and any flags needed to run psql as the pg_user user, like -h {{{pg_host}}}.
 * {{{email_list}}} - the email account to which error messages will be sent.
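Before running any client-side script, it can help to confirm that none of the parameters listed above has been left unset. The Python sketch below assumes that the configuration file has already been read into a dictionary (the file format itself is not covered here) and that every listed parameter is required; both are assumptions, and the function name is hypothetical.

```python
# Parameter names taken from the list above.
REQUIRED_PARAMETERS = [
    "node", "kRSServer", "kRSUser", "kRSPort",
    "pg_host", "pg_port", "pg_user", "pg_dbname",
    "slony_cluster", "kLocalLogDir", "kSQLIngestionProgram",
    "kDeleteSeriesProgram", "ingestion_path", "scp_cmd", "ssh_cmd",
    "rmt_slony_dir", "slony_logs", "PSQL", "email_list",
]


def missing_parameters(config):
    """Return the required parameter names that are absent or blank in config."""
    return [
        name
        for name in REQUIRED_PARAMETERS
        if str(config.get(name, "")).strip() == ""
    ]
```

An empty result means that every listed parameter has some value; it does not check that the values themselves (hosts, ports, paths) are valid.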
Line 18: Line 55:
node
The name of the subscription client. Must be globally unique across all NetDRMS sites. This string will be used in various state files and in file/directory names.
kRSServer
The full domain name of the server (solarport.stanford.edu)
kRSUser
The account on kRSServer that will be used for data transfer (jsocexp)
kRSTriggerDir
The directory accessible from the kRSUser account where data files will be staged (/data/pgsql/slon_logs/live/triggers/)
kRSPort
The port on kRSServer that will be used for data transfer (55000)
pg_host
The client machine that hosts the client PostgreSQL database that will contain the replicated data series
pg_port
The port on the pg_host machine that will be used for communication with the data-series database
pg_user
The PostgreSQL user that will own the replicated series (slony)
pg_dbname
The name of the PostgreSQL database that resides on pg_host
slony_cluster
The name of the Slony cluster to which this node belongs (jsoc)
kLocalLogDir
The client directory that will contain log files
kLocalWorkingDir
The client directory that will contain temporary working files
kSQLIngestionProgram
The path to the script/program that will ingest the site-specific slony logs — usually the path to get_slony_logs.pl
kDeleteSeriesProgram
The path to the program delete_series used to delete DRMS data series on the client
attempts
The number of attempts that the client should make when looking for the flag file indicating the sql file is ready on the server
ingestion_path
The local directory that will contain the ingestion "die" file — used by get_slony_logs.pl
scp_cmd
The absolute path to the client's scp program.
ssh_cmd
The absolute path to the client's ssh program.
rmt_slony_dir
The absolute path, accessible from the kRSUser account, on the server to the directory that contains the site-specific slony logs (/data/pgsql/slon_logs/live/site_logs)
slony_logs
The client directory that will contain the downloaded site-specific slony logs
PSQL
The path to the client's psql program, and any flags needed to run psql as the pg_user user, like -h pg_host
email_list
The email account to which error messages will be sent

== Troubleshooting ==

There are a number of problems that could occur during the subscription process. The client might not have a "clean" environment; there might be an out-of-date _jsoc namespace (used internally by Slony); the series being subscribed to may exist in some form already; there might be a bad parameter in the configuration file; or there might be directory permission issues, etc. The server might also have an unclean environment. It is often easiest to clean up both environments and try again as a new subscriber (please see the section entitled '''Cleaning Up''' below).
Line 68: Line 57:
To clean the subscription client environment: To clean the subscription client environment, the following must be performed by the database user identified by the pg_user parameter of the configuration file (if you have used the suggested user, then this is 'slony'):
Line 71: Line 60:
{{{
% delete_series [ -k ] <series> JSOC_DBUSER=slony
}}}
Line 72: Line 64:
{{{
% psql -h <db server host> -p 5432 -U slony data
data=# DROP SCHEMA <schema> CASCADE;
}}}
Line 77: Line 73:
{{{
% psql -h <db server host> -p 5432 -U slony data
data=# DROP SCHEMA _jsoc CASCADE;
}}}
Line 78: Line 78:
To drop a subscriber from server environment:
 * Disable the Slony-log parser (the best way to do this is to comment out the cron tab for parse_slon_logs.pl)
 * Edit slon_parser.cfg to remove the subscriber's row
 * Remove the subscriber's row from the su_production.slonycfg table:
{{{
jsoc=# DELETE FROM su_production.slonycfg WHERE node='<subscriber node code>';
}}}
 * Remove the subscriber's *.lst file(s) [ there could be a *.new.lst file present if a subscription failed part way ]
 * Remove the subscriber's row from the su_production.slonylst table:
{{{
jsoc=# DELETE FROM su_production.slonylst WHERE node='<subscriber node code>';
}}}
Line 79: Line 91:
To clean the subscription server environment:

 * Remove the site-specific state and content files from the trigger directory (identified by the kRSTriggerDir configuration parameter).
 * Remove all site-specific *.lst and *.new.lst files.
 * Remove the site-specific log-file directory.
 * Remove all site-specific entries in the slon_parser.cfg state file.

=== Re-subscribing ===
You might occasionally need to re-subscribe to a series. There could be bugs at the JSOC or in your site's NetDRMS that result in your site missing some DRMS records (or having extra records). An easy way to reconcile these differences is to unsubscribe from the series, then to re-subscribe to the series. Right after you re-subscribe, your mirrored series will match the JSOC version exactly.

If you unsubscribe from a series by running the appropriate subscribe_series command:
{{{
subscribe_series <path>/repclient.cfg <path>/subscribe_list.cfg <path>/.ssh-agent_rs
}}}
where subscribe_list.cfg contains this line:
{{{
hmi.someseries unsubscribe
}}}
''and'' you answer 'Y' to this prompt:
{{{
Would you like to delete series hmi.someseries? (Y/N)
}}}
then subscribe_series will delete all SUMS Storage Units (SUs) that you have ingested into your SUMS. subscribe_series deletes the series and all associated SUs by calling a program, '''delete_series''', that is part of your NetDRMS. But, since you are going to want to re-subscribe to the series, it would be nice to keep the SUs in your SUMS so that they survive the re-subscription process and do not need to be re-downloaded. So, if your goal is to keep the SUs, but re-subscribe to a series, do not use this method as-is.

There are three methods that can be used to unsubscribe from a series without deleting the SUs, until I can add a new option that causes SUs to be preserved:
 1. Manually edit your subscribe_series script. In the DeleteSeries() function, modify the following line:
{{{
echo "yes"$'\n'"yes"$'\n' | "$delseriesprog" "$series" JSOC_DBUSER=slony
}}}
to
{{{
echo "yes"$'\n'"yes"$'\n' | "$delseriesprog" "-k" "$series" JSOC_DBUSER=slony
}}}
Add a "-k" (lower-case k) to the command line to the program delete_series. After you do this, then you can run subscribe_series with an unsubscribe line in the subscribe_list.cfg file. Remember to revert your subscribe_series program back to the way it was when you have finished unsubscribing.
 1. Run subscribe_series with an unsubscribe parameter in the subscribe_list.cfg file as you normally would, BUT answer "N" to this prompt:
{{{
Would you like to delete series hmi.someseries? (Y/N)
}}}
Then after the unsubscription process has completed, run this program:
{{{
> delete_series -k hmi.someseries JSOC_DBUSER=slony
}}}
 1. Run the delete_series command ahead of time. There is a bit of a timing issue involved, so this method may not work if you happen to run it at just the wrong time. But recovering from an unlucky timing is easy. Run these commands in order:
{{{
> get_slony_logs.pl <path>/repclient.cfg # This is probably running via a cron job - force a run of it.
> delete_series -k hmi.someseries JSOC_DBUSER=slony
> subscribe_series repclient.cfg subscribe_list.cfg <path>/.ssh-agent_rs # an unsubscribe command for hmi.someseries is in the subscribe_list.cfg file
}}}
If a new Slony log gets generated for you after you run get_slony_logs.pl, but before your subscribe_series command completes, then you may receive a Slony log with an insert statement for the series you just deleted, and then your next get_slony_logs.pl command would fail. The chances of this happening are small, but not zero.

At this point, you will have successfully unsubscribed from hmi.someseries, but you will still have the SUs associated with that series in your SUMS. The next step is to subscribe to the series you just unsubscribed from:
{{{
> subscribe_series <path>/repclient.cfg <path>/subscribe_list.cfg <path>/.ssh-agent_rs
}}}
where subscribe_list.cfg contains this line:
{{{
hmi.someseries subscribe
}}}
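The list-file format used here (two white-space separated columns, a series name followed by subscribe or unsubscribe) is simple enough to check before running subscribe_series. The following Python sketch is only an illustration; the function name is hypothetical, and skipping blank lines is an assumption rather than documented behavior of subscribe_series.

```python
VALID_ACTIONS = ("subscribe", "unsubscribe")


def parse_subscribe_list(text):
    """Parse list-file text into (series, action) pairs.

    Each non-blank line must hold exactly two white-space separated fields:
    a series name and either 'subscribe' or 'unsubscribe'.
    """
    entries = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # blank line (assumed to be harmless)
        if len(fields) != 2 or fields[1] not in VALID_ACTIONS:
            raise ValueError(
                f"line {line_number}: expected '<series> subscribe|unsubscribe'"
            )
        entries.append((fields[0], fields[1]))
    return entries
```

A check like this catches a misspelled action before the request reaches the server.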
Finally, if you use the JMD, you will need to set up the hmi.someseries database table so that the JMD knows to fetch SUs for newly ingested records. To do that, run:
{{{
> sunum_queue_trigger_sampled.pl <arguments>
}}}
Unfortunately, I have never run this program, so I do not know which arguments should be provided. The good news is that, since you had the JMD fetching SUs for hmi.someseries before the unsubscription occurred, you have run sunum_queue_trigger_sampled.pl before. Use the same command you used the last time you subscribed to hmi.someseries.
 * Delete the subscriber's log-file directory in /c/pgsql/slon_logs/live/site_logs
 * Enable the Slony-log parser

How To Subscribe to Data Series Created and Maintained by the JSOC-SDP

Background

Solar data at the JSOC are stored in a system that comprises two PostgreSQL databases, multiple file systems, a tape back-up system, and software to manage these components. Related sets of data are grouped into data series. Conceptually, each data series is a table in which each row is typically associated with an observation time or a Carrington rotation. The columns contain metadata, such as the observation time, the ID of the camera used to acquire the data, the image rotation, etc. One column in this table contains an ID that refers to a set of data files, typically a set of FITS files that contain images.

The Data Record Management System (DRMS) is the subsystem that contains and manages the "DRMS" database of metadata and data-file-locator information. One component is a software library, written in C, that provides client programs, also known as "DRMS modules", with an Application Programming Interface (API) that allows the users to access these data. The Storage Unit Management System (SUMS) is the subsystem that contains and manages the "SUMS" database and associated storage hardware. The database contains information needed to locate data files that reside on hardware. The entire system as a whole is typically referred to as DRMS. The user interfaces with the DRMS subsystem only, and the DRMS subsystem interfaces with SUMS - the user does not interact with SUMS directly. The JSOC provides a version of DRMS, NetDRMS, that can be deployed to non-JSOC institutions so that those sites can take advantage of the JSOC-developed software to manage large amounts of solar data.

A NetDRMS site is an institution with a local NetDRMS installation. It does not generate the JSOC-owned production data series (e.g., hmi.M_720s, aia.lev1) that Stanford generates for scientific use. A NetDRMS site can generate its own data, production or otherwise. That site can create software that uses NetDRMS to generate its own data series. But it can also act as a "mirror" for individual data series. When acting as a mirror for a Stanford data series, the site downloads from Stanford DRMS database information and stores it in its own NetDRMS database, and it downloads SUMS files, and stores them in its own SUMS subsystem. As the data files are downloaded to the local SUMS, the SUMS database is updated with the information needed to manage the data files. It is possible for a NetDRMS site to mirror the DRMS data of any other NetDRMS site, but at this point, the only site whose data are currently mirrored is the Stanford JSOC.

Mirroring Data

In order for a NetDRMS site to mirror a data series generated at the JSOC, the data series must first be "published" at the JSOC. NetDRMS sites can mirror only data series that have been published at the JSOC. Then the mirroring site must "subscribe" to the data series. The subscription process is responsible for downloading the DRMS database information to the mirroring site. At the time of subscription, the existing DRMS database information is downloaded to the NetDRMS site and it is then ingested into the NetDRMS DRMS database. As changes are made to the data series at the JSOC, those changes are propagated to the mirroring site and ingested into the local DRMS database. JSOC developers implemented this publication/subscription feature by using Slony-I, PostgreSQL data-replication software (see Series Publication Notes), along with supporting scripts written at Stanford. Slony-I servers that run at the JSOC intercept changes to published data series and generate text files containing the SQL commands that update the DRMS database information at the mirroring site. The supporting scripts are responsible for dumping the database information at the time of subscription, and transferring it to, and ingesting it into, the NetDRMS site. The scripts are also responsible for downloading and ingesting the "update files" to the data series. The mirrored data series can be updated every couple of minutes so that the data series at the JSOC and the ones at the mirroring site are synchronized in near real time.

The update files generated by Slony-I contain SQL. For example, if the JSOC creates a new record of data for a published data series, which entails inserting a row in a DRMS database table, an update file will contain an SQL INSERT statement that, if executed, would duplicate the insertion performed at the JSOC. NetDRMS sites that have a subscription to this data series receive the update file containing the SQL INSERT statement and ingest it: they run psql, the PostgreSQL front-end, which executes every SQL command in the update file without inspecting it. As a security consideration, note that these update files can contain SQL UPDATE and DELETE statements as well. A Perl script running at the mirroring site looks for these update files at the JSOC, downloads them, and runs psql to execute the SQL commands they contain.
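
As a rough illustration of the ingestion step described above, the Python sketch below orders a set of update files and builds a psql invocation for each one. It is not the Perl script used at mirroring sites. The file-name pattern (a numeric counter before the .sql suffix), the function names, and the choice of psql options are assumptions made for illustration only.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical naming pattern: a numeric counter just before ".sql".
# The real update files may be named differently.
COUNTER_RE = re.compile(r"_(\d+)\.sql$")


def order_update_files(names):
    """Return update-file names sorted by numeric counter; skip other files."""
    numbered = []
    for name in names:
        match = COUNTER_RE.search(name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


def psql_command(sql_file, host, port, user, dbname):
    """Build a psql invocation that runs one update file."""
    return [
        "psql", "-h", host, "-p", str(port), "-U", user, "-d", dbname,
        "-v", "ON_ERROR_STOP=1",   # stop at the first failing statement
        "--single-transaction",    # apply the whole file or none of it
        "-f", str(sql_file),
    ]


def apply_updates(log_dir, host, port, user, dbname):
    """Apply every update file found in log_dir, in counter order."""
    names = [path.name for path in Path(log_dir).iterdir()]
    for name in order_update_files(names):
        cmd = psql_command(Path(log_dir) / name, host, port, user, dbname)
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Sorting on the integer counter rather than on the file name keeps file 10 after file 9. The ON_ERROR_STOP and --single-transaction options are a design choice of this sketch, intended to avoid applying a file partially; they do not inspect the SQL, so the security consideration noted above still applies.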

The update files keep the mirroring NetDRMS site's DRMS database information synchronized with the DRMS database information at the JSOC, but they do not synchronize the SUMS data files. A separate system, Remote SUMS (RSUMS), is used to download the data files for newly inserted DRMS records. RSUMS is a client-server architecture that operates in two modes: 1. RSUMS can be used to pre-fetch SUMS data files. In this mode, as the mirroring NetDRMS site ingests the DRMS database information for new data series rows, a database trigger inserts a row in the database table drms.ingested_sunums. The RSUMS client daemon then "sees" the new row in this table and begins the RSUMS download process; 2. RSUMS can be used to fetch SUMS data files as needed, in an on-demand fashion. In this mode, RSUMS operation is triggered only if the mirroring NetDRMS site attempts to use DRMS to access a data file. DRMS observes that the needed data file is not present and, in response, initiates the RSUMS download process.
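
The pre-fetch mode can be pictured as a trigger-fed queue that a daemon drains. The sketch below imitates that pattern with Python's built-in sqlite3 module as a stand-in for PostgreSQL. Only the idea of a trigger filling an ingested_sunums table comes from the description above; the series_records table, the column names, the trigger name, and the download callback are hypothetical, and the real RSUMS client is not implemented this way as far as this page states.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the mirror's DRMS database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE series_records (recnum INTEGER PRIMARY KEY, sunum INTEGER);
        CREATE TABLE ingested_sunums (sunum INTEGER PRIMARY KEY);
        -- Fires whenever an ingested update inserts a new series record.
        CREATE TRIGGER queue_sunum AFTER INSERT ON series_records
        BEGIN
            INSERT OR IGNORE INTO ingested_sunums (sunum) VALUES (NEW.sunum);
        END;
    """)
    return db


def drain_queue(db, download):
    """Pass each queued ID to download(), then remove it from the queue."""
    sunums = [row[0] for row in db.execute(
        "SELECT sunum FROM ingested_sunums ORDER BY sunum")]
    for sunum in sunums:
        download(sunum)  # hypothetical callback that fetches the data files
        db.execute("DELETE FROM ingested_sunums WHERE sunum = ?", (sunum,))
    db.commit()
    return sunums
```

A daemon following this pattern would call drain_queue periodically. In the on-demand mode there is no queue: the download would be started only when a requested data file is found to be missing.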

Subscribing to a Data Series

To subscribe to a data series, a privileged operator at the mirroring NetDRMS site runs the subscribe.py client script. This script, which can be run as many times as desired, allows the operator to subscribe to one or more series at a time. The first time it is run, the subscription server (running at the site running Slony-I) creates a directory for the site in which the site-specific Slony-I update files will reside. These files will contain statements germane only to the set of series to which the site is subscribed.

In brief, the subscription server satisfies a client subscription request by sending to the client a file containing a SQL dump of the database information for the data series to which the client is subscribing. The file will also contain other set-up commands depending on the status of the client as a first-time subscriber or not. For example, if a client requests a subscription to hmi.M_45s, then the dump file will contain commands to create the hmi namespace (if it does not already exist), the hmi.m_45s table (it must not exist at subscription time), the appropriate entries in the hmi.drms_keyword, hmi.drms_segment, hmi.drms_link, and hmi.drms_series tables, and many insert commands. The insert commands will populate the hmi.m_45s table with a copy of all the records from the original corresponding table in the server database at the time of subscription.

After the initial subscription completes (subscribe.py has completed successfully), the client has a static snapshot of the series at the time the command was issued. Changes to the original series on the server will not automatically be applied to the client's database. However, the server will regularly generate site-specific Slony-I update files for the client. These update files contain INSERT (and UPDATE and DELETE) statements that, when applied by the client, will update the subscribed-series tables so that, in effect, the client tables are kept synchronized with the corresponding server tables. The client must regularly download these update files and apply them to remain synchronized with the server.

It is also possible to discontinue one or more subscriptions with the subscribe.py script. Site-specific update files generated after the completion of subscribe.py in this mode will no longer contain any statements relevant to the data series dropped from subscription. The un-subscription process does not remove from the client DRMS database the data for the data series being removed. The data series remain intact and function normally in every way; but they will become static snapshots of the server series at the time of un-subscription. Subsequent changes to the corresponding server data series will not propagate to the client.

Before a client can run subscribe.py, several steps must be taken. First, the client must set up an ssh-agent session so that noninteractive ssh-based communications between the client and server can be established. To do this, follow the directions in SSHKeyNotes, ensuring that you do create a passphrase. Second, the client must edit a configuration file that contains server and client information, such as the server's host machine and port, the server account that ssh uses, the path to the client's scp command, client directories to receive the slony logs, etc. We provide a documented template configuration file at base/drms/replication/etc/repclient.template.cfg (see the section entitled Configuration File below for more information). This configuration file is used for all client-side programs/scripts, including subscribe.py.

After completing the set-up, the client operator runs subscribe.py with three arguments: the path to the configuration file (see Configuration File), the request type ('subscribe', 'resubscribe', or 'unsubscribe'), and the data series (e.g., hmi.V_45s). subscribe.py makes several CGI calls to the JSOC. Before initiating the subscription process for a data series, it checks whether the data series is in fact published at the JSOC by querying the AJAX publist service URL at the JSOC. subscribe.py also uses the same service to determine whether the NetDRMS site already has an active subscription for the data series; if so, subscribe.py skips the subscription and notifies the user. subscribe.py then invokes a second CGI, the subscription service, to initiate the subscription process. It first checks whether the NetDRMS site has already started the subscription process for the data series and, if not, makes a subscription request. The request entails HTTP interaction between the subscription server at the JSOC and subscribe.py at the NetDRMS mirroring site. SQL dump files are iteratively created at the JSOC and transferred to the NetDRMS site via HTTP download. At the conclusion of the process, subscribe.py terminates and provides a message indicating success or failure.
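
The checks made before a 'subscribe' request can be summarized as a small decision function. The sketch below is not taken from subscribe.py: the function name, the returned strings, and the case-insensitive comparison of series names are assumptions, and the two input sets stand in for whatever the publist service reports.

```python
def plan_subscription(series, published, subscribed):
    """Decide what to do with a 'subscribe' request for one series.

    published  -- series names the server reports as published
    subscribed -- series names this site already subscribes to
    Names are compared case-insensitively (an assumption of this sketch).
    """
    name = series.lower()
    if name not in {s.lower() for s in published}:
        return "error: not published"
    if name in {s.lower() for s in subscribed}:
        return "skip: already subscribed"
    return "request subscription"
```

For example, plan_subscription("hmi.V_45s", {"hmi.V_45s"}, set()) yields "request subscription", while the same call with "hmi.V_45s" in the subscribed set yields "skip: already subscribed".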

Configuration File

The configuration file contains configuration information needed by several client-side programs and scripts (subscribe.py, get_slony_logs.pl). It consists of several key-value pairs separated by commas (values appropriate when the Stanford JSOC is the server are included in parentheses):

  • node - the name of the subscription client (e.g., jsoc, nsocu, sdac). It must be globally unique across all NetDRMS sites; this string will be used in various state files and in file/directory names.

  • kRSServer - the full domain name of the subscription log server (e.g., jsocport.stanford.edu).

  • kRSUser - the account on kRSServer that will be used for data transfer (e.g., jsocexp).

  • kRSPort - the port on kRSServer that will be used for data transfer (e.g., 22 for scp).

  • pg_host - the client machine that hosts the client PostgreSQL database that will contain the replicated data series - this is <PostgreSQL host>.

  • pg_port - the port on the pg_host machine that will be used for communication with the data-series database.

  • pg_user - the PostgreSQL user that will own the replicated series (e.g., <NetDRMS production user>).

  • pg_dbname - the name of the PostgreSQL database that resides on pg_host (e.g., netdrms).

  • slony_cluster - the name of the Slony cluster to which this node belongs (e.g., jsoc for a client subscribing to series published at the JSOC).

  • kLocalLogDir - the client directory that will contain the subscription-process logs.

  • kSQLIngestionProgram - the path to the script/program that will ingest the site-specific slony logs, usually the path to get_slony_logs.pl (e.g., <NetDRMS root>/base/drms/replication/get_slony_logs.pl).

  • kDeleteSeriesProgram - the path to the program delete_series, which is used to delete DRMS data series on the client when requested.

  • ingestion_path - the local directory that will contain the ingestion "die" file — used by get_slony_logs.pl

  • scp_cmd - the absolute path to the client's scp program.

  • ssh_cmd - the absolute path to the client's ssh program.

  • rmt_slony_dir - the absolute path, accessible from the kRSUser account, on the server to the directory that contains the site-specific slony logs (/data/pgsql/slon_logs/live/site_logs).

  • slony_logs - the client directory that contains the downloaded site-specific slony logs.

  • PSQL - the path to the client's psql program, and any flags needed to run psql as the pg_user user, like -h pg_host.

  • email_list - the email account to which error messages will be sent.
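
Before running any client-side script, it can help to confirm that every parameter listed above has a value. The sketch below performs that check on a configuration already loaded into a Python dict. How the file is parsed is deliberately left out, and treating every listed key as required, as well as requiring the two port values to be integers, are assumptions of this sketch.

```python
# Parameter names taken from the list above.
REQUIRED_KEYS = (
    "node", "kRSServer", "kRSUser", "kRSPort",
    "pg_host", "pg_port", "pg_user", "pg_dbname",
    "slony_cluster", "kLocalLogDir", "kSQLIngestionProgram",
    "kDeleteSeriesProgram", "ingestion_path", "scp_cmd", "ssh_cmd",
    "rmt_slony_dir", "slony_logs", "PSQL", "email_list",
)

PORT_KEYS = ("kRSPort", "pg_port")


def check_config(config):
    """Return (missing_keys, non_numeric_port_keys) for a config dict."""
    missing = [key for key in REQUIRED_KEYS
               if not str(config.get(key, "")).strip()]
    bad_ports = [key for key in PORT_KEYS
                 if key in config and not str(config[key]).strip().isdigit()]
    return missing, bad_ports
```

Both returned lists are empty when the configuration passes these checks.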

Cleaning Up

To clean the subscription client environment, the following steps must be performed by the database user identified by the pg_user parameter of the configuration file (if you have used the suggested user, then this is 'slony'):

  • Remove all subscribed-to series. Use delete_series to delete each series. The -k flag might be appropriate; if it is set, the SUMS data files associated with the series will not be deleted.

% delete_series [ -k ] <series> JSOC_DBUSER=slony
  • Remove the schemas of the subscribed-to series that had existed.

% psql -h <db server host> -p 5432 -U slony data
data=# DROP SCHEMA <schema> CASCADE;
  • Remove the records from admin.ns for the schemas just deleted

DELETE FROM admin.ns WHERE name = '<schema>';
  • Remove the _jsoc schema.

% psql -h <db server host> -p 5432 -U slony data
data=# DROP SCHEMA _jsoc CASCADE;
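
The SQL portion of the client clean-up can be generated for a list of schemas, which makes it easier to review the statements before running them in psql. The sketch below is only a helper for composing the statements shown above; the function name and the identifier check are assumptions, and running delete_series for each series remains a separate, earlier step.

```python
import re

# Accept only plain lowercase identifiers so a name cannot smuggle in SQL.
IDENT_RE = re.compile(r"^[a-z_][a-z0-9_]*$")


def cleanup_statements(schemas):
    """Return the clean-up SQL, in the order of the steps listed above."""
    for schema in schemas:
        if not IDENT_RE.match(schema):
            raise ValueError(f"unexpected schema name: {schema!r}")
    statements = [f"DROP SCHEMA {schema} CASCADE;" for schema in schemas]
    statements += [f"DELETE FROM admin.ns WHERE name = '{schema}';"
                   for schema in schemas]
    statements.append("DROP SCHEMA _jsoc CASCADE;")  # always the last step
    return statements
```

Printing the result of cleanup_statements(["hmi"]) gives three statements that can be pasted into a psql session run as the pg_user user.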

To drop a subscriber from the server environment:

  • Disable the Slony-log parser (the best way to do this is to comment out the crontab entry for parse_slon_logs.pl)
  • Edit slon_parser.cfg to remove the subscriber's row
  • Remove the subscriber's row from the su_production.slonycfg table:

jsoc=# DELETE FROM su_production.slonycfg WHERE node='<subscriber node code>';
  • Remove the subscriber's *.lst file(s) [ there could be a *.new.lst file present if a subscription failed part way ]
  • Remove the subscriber's row from the su_production.slonylst table:

jsoc=# DELETE FROM su_production.slonylst WHERE node='<subscriber node code>';
  • Delete the subscriber's log-file directory in /c/pgsql/slon_logs/live/site_logs
  • Enable the Slony-log parser

JsocWiki: SeriesSubscription (last edited 2020-01-17 06:03:19 by ArtAmezcua)