NetDRMS - a shared data management system

Introduction

In order to process, archive, and distribute the substantial quantity of solar data captured by the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI) instruments on the Solar Dynamics Observatory (SDO), the Joint Science Operations Center (JSOC) has developed its own data-management system, NetDRMS. This system comprises two PostgreSQL databases, multiple file systems, a tape back-up system, and software to manage these components. Related sets of data are grouped into data series; each is, conceptually, a table in which each row is typically associated with an observation time or a Carrington rotation. As an example, the data series hmi.M_45s contains the HMI 45-second cadence magnetograms, both observation metadata and image FITS files. The columns contain metadata, such as the observation time, the ID of the camera used to acquire the data, the image rotation, etc. One column in this table contains an ID that refers to a set of data files, typically a set of FITS files that contain images.

The Data Record Management System (DRMS) is the subsystem that contains and manages the "DRMS" database of metadata and data-file-locator information. One component is a software library, written in C, that provides client programs, also known as "DRMS modules", with an Application Programming Interface (API) that allows the users to access these data. The Storage Unit Management System (SUMS) is the subsystem that contains and manages the "SUMS" database and associated storage hardware. The database contains information needed to locate data files that reside on hardware. The entire system as a whole is typically referred to as DRMS. The user interfaces with the DRMS subsystem only, and the DRMS subsystem interfaces with SUMS - the user does not interact with SUMS directly. The JSOC provides NetDRMS to non-JSOC institutions so that those sites can take advantage of the JSOC-developed software to manage large amounts of solar data.

A NetDRMS site is an institution with a local NetDRMS installation. It does not generate the JSOC-owned production data series (e.g., hmi.M_720s, aia.lev1) that Stanford generates for scientific use. A NetDRMS site can generate its own data, production or otherwise. That site can create software that uses NetDRMS to generate its own data series. But it can also act as a "mirror" for individual data series. When acting as a mirror for a Stanford data series, the site downloads from Stanford DRMS database information and stores it in its own NetDRMS database, and it downloads SUMS files, and stores them in its own SUMS subsystem. As the data files are downloaded to the local SUMS, the SUMS database is updated with the information needed to manage the data files. It is possible for a NetDRMS site to mirror the DRMS data of any other NetDRMS site, but at this point, the only site whose data are currently mirrored is the Stanford JSOC.

Installing NetDRMS

Installing the NetDRMS system requires:

Optional steps include:

For best results, and to facilitate debugging issues, please follow these steps in order.

Installing PostgreSQL

PostgreSQL is a relational database management system. Data are stored primarily in relations (tables) of records that can be mapped to each other - given one or more records, you can query the database to find other records. These relations are organized on disk in a hierarchical fashion. At the top level are one or more database clusters. A cluster is simply a storage location on disk (i.e., directory). PostgreSQL manages the cluster's data files with a single process, or PostgreSQL instance. Various operations on the cluster will result in PostgreSQL forking new ephemeral child processes, but ultimately there is only one master/parent process per cluster.

Each cluster contains the data for one or more databases. Each cluster requires a fair amount of system memory, so it makes sense to install a single cluster on a single host. It does not make sense to make separate clusters, each holding one database; each cluster can efficiently support many databases, which are then fairly independent of each other. In terms of querying, the databases are completely independent (i.e., a query on one database cannot involve relations in a different database). However, two databases in a single cluster do share the same disk directory, so there is not the same degree of independence at the OS/filesystem level. This may only matter if an administrator is operating directly on the files (performing backups, replication, creating standby systems, etc.).

To install PostgreSQL, select a host machine, <PostgreSQL host>, to act as the PostgreSQL database server. We recommend installing only PostgreSQL on this machine, given the large amount of memory and resources required for optimal PostgreSQL operation. We find a Fedora-based system, such as CentOS, to be a good choice, but please visit https://www.postgresql.org/docs for system requirements and other information germane to installation. The following instructions assume a Fedora-based Linux system such as CentOS (documentation for other distributions, such as Debian and openSUSE can be found online) and a bash shell.

Install the needed PostgreSQL server packages on <PostgreSQL host> by first visiting https://yum.postgresql.org/repopackages.php to locate and download the PostgreSQL "repo" rpm file appropriate for your OS and architecture. Each repo rpm contains a yum configuration file that can be used to install all supported PostgreSQL releases. You should install the latest version if possible (version 12, as of the time of this writing). Although you can use your browser to download the file, it might be easier to use Linux command-line tools:

$ curl -OL https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

Install the yum repo configuration file (pgdg-redhat-all.repo) from the downloaded repo rpm file:

$ sudo rpm -i pgdg-redhat-repo-latest.noarch.rpm

This installs the repo configuration file to /etc/yum.repos.d/. Find the names of the PostgreSQL packages needed from the repository; the following assumes PostgreSQL 12, but should you want to install an older version, replace "12" with one of 94, 95, 96, 10, or 11:

$ yum list --disablerepo='*' --enablerepo=pgdg12 2>/dev/null | grep -Eo '^.*postgresql[0-9]*\.' | cut -d '.' -f 1
postgresql12
$ yum list --disablerepo='*' --enablerepo=pgdg12 2>/dev/null | grep -Eo '^.*postgresql.*devel\.' | cut -d '.' -f 1 
postgresql12-devel
$ yum list --disablerepo='*' --enablerepo=pgdg12 2>/dev/null | grep -Eo '^.*postgresql.*contrib\.' | cut -d '.' -f 1
postgresql12-contrib
$ yum list --disablerepo='*' --enablerepo=pgdg12 2>/dev/null | grep -Eo '^.*postgresql.*libs\.' | cut -d '.' -f 1 
postgresql12-libs
$ yum list --disablerepo='*' --enablerepo=pgdg12 2>/dev/null | grep -Eo '^.*postgresql.*plperl\.' | cut -d '.' -f 1 
postgresql12-plperl
$ yum list --disablerepo='*' --enablerepo=pgdg12 2>/dev/null | grep -Eo '^.*postgresql.*server\.' | cut -d '.' -f 1 
postgresql12-server

Use yum to install all six packages:

$ sudo yum install <packages>

where <packages> are the package names determined in the previous step (postgresql12 postgresql12-contrib postgresql12-devel postgresql12-libs postgresql12-plperl postgresql12-server). The rpm package installation will have created the PostgreSQL superuser Linux account <PostgreSQL superuser> (i.e., postgres); <PostgreSQL superuser> will own the PostgreSQL database clusters and server processes that will be created in the following steps. To perform the next steps, you will need to become user <PostgreSQL superuser>:

$ sudo su - <PostgreSQL superuser>

Depending on where the package files are installed, you might need to add the directory containing the PostgreSQL commands to your PATH environment variable. To test this, run:

$ which initdb

If the initdb command cannot be found, then add the PostgreSQL binaries path to PATH. Find the path to the PostgreSQL installation:

$ rpm -ql postgresql12
<PostgreSQL install dir>/bin/clusterdb
...

In this example, <PostgreSQL install dir> is /usr/pgsql-12. Then add the binary path to PATH:

$ export PATH=/usr/pgsql-12/bin:$PATH

<PostgreSQL superuser> will be using the binaries in this directory, so it is a good idea to add the export command to .bashrc. As described above, create one database cluster for the two databases (one for DRMS data, and one for SUMS data):

$ whoami
postgres
$ initdb --locale=C -D <PostgreSQL cluster>

where <PostgreSQL cluster> should be /var/lib/pgsql/netdrms. Use this path, unless there is some good reason you cannot. initdb will initialize the cluster data directory (identified by the -D argument). This will result in the creation of template databases, configuration files, and other items.

The database cluster will contain two configuration files you need to edit: postgresql.conf and pg_hba.conf. Please refer to the PostgreSQL documentation to properly edit these files. Here are some brief suggestions:

Here are the recommended entries:

Should you need to edit either of these configuration files AFTER you have started the database instance (by running pg_ctl start, as described in the next section), you will need to either reload or restart the instance:

$ whoami
postgres
# reload
$ pg_ctl reload -D <PostgreSQL cluster>
# restart
$ pg_ctl restart -D <PostgreSQL cluster>

Initializing PostgreSQL

You now need to initialize your PostgreSQL instance by creating the DRMS and SUMS databases, installing database-server languages, creating a schema, and creating a relation. To accomplish this, become <PostgreSQL superuser>; all steps in this section must be performed by the superuser:

$ sudo su - postgres

Start the database instance for the cluster you created:

$ whoami
postgres
$ pg_ctl start -D <PostgreSQL cluster>

You previously created <PostgreSQL cluster>, which will most likely be /var/lib/pgsql/netdrms. Ensure that the configuration files you edited work. This can be done by attempting to connect to the database server as <PostgreSQL superuser> with psql from <PostgreSQL host>:

$ whoami
postgres
$ ssh <PostgreSQL host>
$ psql
psql (12.1)
Type "help" for help.

postgres=# 

You should not see any errors, and you should see the postgres=# superuser psql prompt. After you encounter no errors, create the two databases:

$ whoami
postgres
# create the DRMS database
$ createdb --locale C -E UTF8 -T template0 netdrms
# create the SUMS database
$ createdb --locale C -E UTF8 -T template0 netdrms_sums

Install the required database-server languages:

$ whoami
postgres
# versions <= 9.6: create the PostgreSQL scripting language
# (versions > 9.6: no need to create the PostgreSQL scripting language)
$ createlang plpgsql netdrms
# versions <= 9.6: create the "trusted" perl language
$ createlang -h <PostgreSQL host> plperl netdrms
# versions <= 9.6: create the "untrusted" perl language
$ createlang -h <PostgreSQL host> plperlu netdrms
# versions > 9.6: create the "trusted" and "untrusted" perl languages
$ psql netdrms
netdrms=# CREATE EXTENSION IF NOT EXISTS plperl;
netdrms=# CREATE EXTENSION IF NOT EXISTS plperlu;
netdrms=# \q

The SUMS database does not use any language extensions so there is no need to create any for the SUMS database.
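
The version-dependent choice above can be captured in a small helper. The following is a minimal sketch, not part of NetDRMS: lang_install_cmd is a hypothetical function that only prints the command appropriate for a given server version (it does not run anything), and it assumes a version string of the form 9.6 or 12.1.

```shell
# Print the command that installs a procedural language, given the server version.
# lang_install_cmd is a hypothetical helper for illustration only.
lang_install_cmd() {
    local version="$1" lang="$2" db="$3"
    local major="${version%%.*}"        # "9.6" -> 9, "12.1" -> 12
    if [ "$major" -lt 10 ]; then
        # versions <= 9.6: use the createlang utility
        printf 'createlang %s %s\n' "$lang" "$db"
    else
        # versions > 9.6: use CREATE EXTENSION through psql
        printf 'psql %s -c "CREATE EXTENSION IF NOT EXISTS %s;"\n' "$db" "$lang"
    fi
}

lang_install_cmd 9.6 plperl netdrms
lang_install_cmd 12.1 plperlu netdrms
```

Printing the command rather than executing it makes it easy to review the result before running it as <PostgreSQL superuser>.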

Installing CFITSIO

The base NetDRMS release requires CFITSIO, a C library used by NetDRMS to read and write FITS files. Visit https://heasarc.gsfc.nasa.gov/fitsio/ to obtain the link to the CFITSIO source-code tarball. Create an installation directory <CFITSIO install dir> (such as /opt/cfitsio-X.XX, with a link from /opt/cfitsio to <CFITSIO install dir>), download the tarball, and extract the tarball into <CFITSIO install dir>:

$ sudo mkdir -p <CFITSIO install dir>
$ curl -OL 'http://heasarc.gsfc.nasa.gov/FTP/software/fitsio/c/cfitsio-X.XX.tar.gz'
$ tar xvzf cfitsio-X.XX.tar.gz
$ cd cfitsio-X.XX

Please read the README file for complete installation instructions. As a quick start, run:

$ ./configure --prefix=<CFITSIO install dir>
# build the CFITSIO library
$ make
# install the libraries and binaries to <CFITSIO install dir>
$ sudo make install
# create the link from cfitsio to <CFITSIO install dir>
$ sudo su -
$ cd <CFITSIO install dir>/..
$ ln -s <CFITSIO install dir> cfitsio

To link NetDRMS code against CFITSIO, you also need the libcurl development package:

$ sudo yum install libcurl-devel

Installing OpenSSL Development Packages

NetDRMS requires the OpenSSL Developer's API. If this API has not already been installed, do so now:

$ sudo yum install openssl-devel

Installing Python3

NetDRMS requires that a number of python packages and modules be present that are not generally part of a system installation. In addition, many scripts require python3 and not python2. The easiest way to satisfy these needs is to install a data-science-oriented python3 distribution, such as Anaconda. In that vein, install Anaconda into an appropriate installation directory such as /opt/anaconda3. To locate the Linux installer, visit https://docs.anaconda.com/anaconda/install/linux/:

$ curl -OL 'https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh'
$ sha256sum Anaconda3-2019.10-Linux-x86_64.sh
46d762284d252e51cd58a8ca6c8adc9da2eadc82c342927b2f66ed011d1d8b53  Anaconda3-2019.10-Linux-x86_64.sh
$ sudo bash Anaconda3-2019.10-Linux-x86_64.sh

After some initial prompts, the installer will display

PREFIX=/home/<user>/anaconda3

This path is the default installation directory (<user> is the user running bash). Replace the PREFIX path with <Anaconda3 install dir>.
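
The sha256sum output shown earlier is meant to be compared by eye against the checksum published by Anaconda. As a sketch, that comparison can be automated; verify_sha256 is a hypothetical helper, and the expected value must be taken from the Anaconda download page, not computed from the downloaded file.

```shell
# Succeed only if the SHA-256 digest of a file equals the expected hex string.
# verify_sha256 is a hypothetical helper for illustration only.
verify_sha256() {
    local file="$1" expected="$2" actual
    # sha256sum prints "<digest>  <file name>"; keep only the digest
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Usage (file name and published checksum are supplied by the caller):
#   verify_sha256 <installer file> <published checksum> && sudo bash <installer file>
```

With this guard, the installer runs only when the digest matches.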

Installing NetDRMS

To install NetDRMS, you will need to: select an appropriate machine on which to install NetDRMS and appropriate hardware on which to host the SUMS service; create Linux users and groups; download the NetDRMS release tarball and extract the release source; initialize the Linux environment; create log directories; create the configuration file and run the configuration script; compile and install the executables; create the DRMS- and SUMS-database users, relations, functions, and other objects; initialize the SUMS storage hardware; and install the SUMS and Remote SUMS daemons.

The optimal hardware configuration will likely depend on your needs, but the following recommendations should suffice for most sites. DRMS and SUMS can share a single host machine. The most widely used and tested Linux distributions are Fedora-based, and at the time of this writing, CentOS is the most popular. Sites have successfully used openSUSE too, but if possible, we would recommend using CentOS. SUMS requires a large amount of storage to hold the DRMS data-series data/image files. The amount needed can vary widely, and depends directly on the amount of data you wish to keep online at any given time. Most NetDRMS sites mirror some amount of (but not all) JSOC SDO data - the more data mirrored, the larger the amount of storage needed. To complicate matters, a site can also mirror only a subset of each data series' data; perhaps one site wishes to retain only the current month's data of many data series, but another wishes to retain all data for one or two series. To decide on the amount of storage needed, you will have to ask the JSOC how much data each series comprises and decide how much of that data you want to keep online. Data that goes offline can always be retrieved automatically from the JSOC again. Data will arrive each day, so request from the JSOC an estimate of the rate of data growth. We recommend doing a rough calculation based upon these considerations, and then doubling the resulting number and installing that amount of storage.
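
The rough calculation recommended above can be sketched as follows. estimate_storage_gb is a hypothetical helper, and the numbers in the example are made up; the real inputs (current series sizes and daily growth) must come from the JSOC.

```shell
# Rough SUMS storage estimate in GB, using integer arithmetic.
# estimate_storage_gb is a hypothetical helper for illustration only.
estimate_storage_gb() {
    local current_gb="$1" growth_gb_per_day="$2" days="$3"
    # data to keep online now, plus expected growth over the planning period,
    # doubled as recommended in the text
    echo $(( 2 * (current_gb + growth_gb_per_day * days) ))
}

# Made-up example: 5000 GB online today, 10 GB/day of growth, planned over 365 days
estimate_storage_gb 5000 10 365    # prints 17300
```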

Next, create a production Linux user <NetDRMS production user> (named netdrms_production by default):

$ sudo useradd <NetDRMS production user>
$ sudo passwd <NetDRMS production user>
Changing password for user <NetDRMS production user>.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
$

NOTE: ensure that <NetDRMS production user> is a valid PostgreSQL name because NetDRMS makes use of the PostgreSQL feature whereby attempts to connect to a database are made as the database user whose name matches the name of the Linux user connecting to the database. Please see https://www.postgresql.org/docs/12/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS for a description of valid PostgreSQL names.
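
A quick way to check a candidate Linux user name is sketched below. is_plain_pg_name is a hypothetical helper, and it is deliberately stricter than PostgreSQL itself: it accepts only names that need no quoting and no case folding (lowercase ASCII letters, digits, and underscores, not starting with a digit, at most 63 bytes, which is the default identifier length limit).

```shell
# Succeed if a name can be used unquoted, as-is, as a PostgreSQL role name.
# is_plain_pg_name is a hypothetical, conservative check for illustration only.
is_plain_pg_name() {
    local name="$1"
    # LC_ALL=C keeps the [a-z] range limited to ASCII lowercase letters
    printf '%s\n' "$name" | LC_ALL=C grep -Eq '^[a-z_][a-z0-9_]*$' || return 1
    # PostgreSQL truncates identifiers longer than 63 bytes by default
    [ "${#name}" -le 63 ]
}

if is_plain_pg_name netdrms_production; then
    echo "netdrms_production: ok"
fi
is_plain_pg_name net-drms || echo "net-drms: not a plain PostgreSQL name"
```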

As <NetDRMS production user>, you will be running various python components. As such, you should make sure that the PYTHONPATH environment variable is not already set, otherwise it might interfere with the running of Anaconda python. You might also need to modify your PATH environment variable to point to Anaconda and PostgreSQL executables. Modify your .bashrc to do so:

# add to <NetDRMS production user>'s .bashrc 

# to ensure that <NetDRMS production user> uses Anaconda
unset PYTHONPATH

# python executables
export PATH=<Anaconda3 install dir>/bin:$PATH

# PostgreSQL executables
export PATH=<PostgreSQL install dir>/bin:$PATH

For the changes to .bashrc to take effect, either logout/login or source .bashrc.
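
Because .bashrc can be sourced more than once in a session, the plain export lines above may add the same directory to PATH repeatedly. A guarded variant is sketched here; prepend_path is a hypothetical helper, not something NetDRMS provides.

```shell
# Prepend a directory to PATH only if it is not already present.
# prepend_path is a hypothetical helper for illustration only.
prepend_path() {
    local dir="$1"
    case ":$PATH:" in
        *":$dir:"*) ;;                  # already in PATH: leave it unchanged
        *) PATH="$dir:$PATH" ;;
    esac
}

# In .bashrc, in place of the two export PATH lines:
#   prepend_path <Anaconda3 install dir>/bin
#   prepend_path <PostgreSQL install dir>/bin
#   export PATH
```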

NetDRMS requires additional python packages not included in the Anaconda distribution, but if you install Anaconda, then the number of additional packages you need to install is minimal. If you have a different python distribution, then you may need to install additional packages. To install new Anaconda packages, as <NetDRMS production user> first create a virtual environment for NetDRMS (named netdrms):

$ whoami
<NetDRMS production user>
$ conda create --name netdrms
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/netdrms_production/.conda/envs/netdrms

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate netdrms
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Then install the new Anaconda packages using conda:

$ whoami
<NetDRMS production user>
$ conda install -n netdrms psycopg2 psutil python-dateutil
Collecting package metadata (current_repodata.json): done
...

In order for conda to succeed when installing psycopg2, PostgreSQL must have already been installed, and the PostgreSQL executables must be in the PATH environment variable (one step in the installation process is running pg_config). The changes to .bashrc described above should take care of setting the path - make sure you have either re-logged-in or sourced .bashrc.

Create the Linux group <SUMS users>, e.g. sums_users, to which all SUMS users belong, including <NetDRMS production user>. This group will be used to ensure that all SUMS users can create new data files in SUMS:

$ sudo groupadd sums_users

Add <NetDRMS production user> to this group (later you will add each SUMS user - users who will read/write SUMS data files - to this group as well):

$ sudo usermod -a -G sums_users <NetDRMS production user>
$ id <NetDRMS production user>
uid=1001(netdrms_production) gid=1001(netdrms_production) groups=1001(netdrms_production),1002(sums_users)

Select <NetDRMS root>, the root directory of the NetDRMS source tree installed by <NetDRMS production user>. A typical choice for <NetDRMS root> is /opt/netdrms or /usr/local/netdrms, and a typical strategy is to install the source tree into a directory that contains the release version in its name (<NetDRMS install dir>), e.g., /opt/netdrms-9.3. Then <NetDRMS production user> makes a link from <NetDRMS root> to <NetDRMS install dir>. This facilitates the maintenance of multiple releases. To switch between releases, <NetDRMS production user> simply updates the link to point to the desired release directory. Create <NetDRMS install dir> and make <NetDRMS production user> the owner:

$ sudo mkdir -p <NetDRMS install dir>
$ sudo chown <NetDRMS production user>:<NetDRMS production user> <NetDRMS install dir>

As <NetDRMS production user>, obtain a NetDRMS tarball from http://jsoc.stanford.edu/netdrms/dist/ and extract it into a release-specific directory:

$ cd <NetDRMS install dir>
$ curl -OL 'http://jsoc.stanford.edu/netdrms/dist/netdrms_X.X.tar.gz'
$ tar xvzf netdrms_X.X.tar.gz
$ <ctrl-d>

Create the link from <NetDRMS root> to <NetDRMS install dir>:

$ cd <NetDRMS install dir>/..
$ sudo ln -s <NetDRMS install dir> netdrms
$ 
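
Switching between releases, as described above, amounts to re-pointing the <NetDRMS root> link. A minimal sketch follows, assuming GNU ln; switch_release is a hypothetical helper, and any release directory names passed to it are whatever exists at your site.

```shell
# Re-point the <NetDRMS root> symbolic link at a different release directory.
# switch_release is a hypothetical helper for illustration only.
switch_release() {
    local root="$1" release_dir="$2"
    if [ ! -d "$release_dir" ]; then
        echo "no such release directory: $release_dir" >&2
        return 1
    fi
    # -n replaces an existing link instead of creating a new link inside its target
    ln -sfn "$release_dir" "$root"
}

# Usage:
#   switch_release <NetDRMS root> <NetDRMS install dir>
```

Use sudo if the parent directory of <NetDRMS root> is not writable by <NetDRMS production user>.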

As <NetDRMS production user>, set the two environment variables that are needed for proper NetDRMS operation. To do so, you'll need first to determine the appropriate <architecture> string for one of these variables:

$ whoami
<NetDRMS production user>
$ cd <NetDRMS root>
$ build/jsoc_machine.csh
<architecture>

It is best to set the following two environment variables in <NetDRMS production user>'s .bashrc file since they must always be set whenever any NetDRMS code is run:

# .bashrc
export JSOCROOT=<NetDRMS root>
export JSOC_MACHINE=<architecture>

Make the SUMS log directory on the SUMS server machine. Various SUMS log files will be written to this directory. A suitable directory would reside in the <NetDRMS production user> user's home directory, e.g., $HOME/log/SUMS:

$ whoami
<NetDRMS production user>
$ mkdir -p <SUMS logs>

Select appropriate C and Fortran compilers. The DRMS part of NetDRMS must be compiled with a C compiler. NetDRMS supports both the GNU C compiler (gcc), and the Intel C++ compiler (icc). Certain JSOC-specific code requires Fortran compilation. For those projects, NetDRMS supports the GNU Fortran compiler (gfortran), and the Intel Fortran compiler (ifort). SUMS is implemented as a Python daemon, so no compilation step is needed. Both GNU and Intel are widely used, so feel free to use either. By default, Intel compilers are used. There are two methods for changing the compilers:

Create the <NetDRMS root>/config.local configuration file, using <NetDRMS root>/config.local.newstyle.template as a template. This file contains a number of configuration parameters, along with detailed descriptions of what they control and suggested values for those parameters. The configuration script, configure, reads this file, and then creates one output file, drmsparams.*, in <NetDRMS root>/localization for each of several programming languages/tools (C, GNU make, perl, python, bash). In this manner, the parameters are directly readable by several languages/tools used by NetDRMS. Lines that start with whitespace or the hash symbol (#) are ignored.

Several sections compose config.local:

__STYLE__
new

__DEFS__
# these are NetDRMS-wide parameter values; the format is <quote code>:<parameter name><whitespace>+<parameter value>;
# the configuration script uses <quote code> to assist in creating language-specific parameters; <quote code> is one of:
#   'q' (enclose the parameter value in double quotes)
#   'p' (enclose the parameter value in parentheses)
#   'a' (do not modify the parameter value). 

__MAKE__
# these are make variables used by the make system during compilation - they generally contain paths to third-party code
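
To make the three quote codes concrete, the sketch below renders a single __DEFS__ line as a C-style definition. This is an illustration only: render_c_define is a hypothetical function, the parameter values are made up, and the actual files written by configure may be formatted differently.

```shell
# Render one __DEFS__ line ("<quote code>:<name> <value>") as a C-style #define.
# render_c_define is a hypothetical helper; it is not the configure script.
render_c_define() {
    local line="$1" code rest name value
    code="${line%%:*}"                        # quote code before the first ':'
    rest="${line#*:}"                         # "<name><whitespace><value>"
    name="${rest%%[[:space:]]*}"              # text up to the first whitespace
    value="${rest#"$name"}"                   # whitespace and value
    value="${value#"${value%%[![:space:]]*}"}"    # strip the leading whitespace
    case "$code" in
        q) printf '#define %s "%s"\n' "$name" "$value" ;;   # double quotes
        p) printf '#define %s (%s)\n' "$name" "$value" ;;   # parentheses
        a) printf '#define %s %s\n' "$name" "$value" ;;     # unmodified
        *) echo "unknown quote code: $code" >&2; return 1 ;;
    esac
}

render_c_define 'q:SUMS_MANAGER   netdrms_production'
render_c_define 'p:DRMS_LOCAL_SITE_CODE 0x0001'
```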

Before creating config.local, please request from the JSOC a value for DRMS_LOCAL_SITE_CODE. This code uniquely identifies each NetDRMS installation. Each site requires one ID for each of its NetDRMS installations.

The __MAKE__ section:

When installing NetDRMS updates, copy the existing config.local to the new <NetDRMS install dir> and edit the copy as needed, using the new config.local.newstyle.template to obtain information about parameters new to the newer release. Many of the parameter values have been determined during the previous steps of the installation process.

Run the configuration script, configure, a csh shell script which is included in <NetDRMS install dir>.

configure reads config.local and uses the contained parameters to configure many of the NetDRMS features. It creates several directories in <NetDRMS install dir>:

Compile DRMS. To make the DRMS part of NetDRMS, run:

As <PostgreSQL superuser>, create the DRMS database production user <DRMS DB production user>. Since PostgreSQL automatically attempts to use the Linux user name as the PostgreSQL user name when a connection attempt is made, use Linux user <NetDRMS production user> for database user <DRMS DB production user>:

$ whoami
<PostgreSQL superuser>
$ psql netdrms
netdrms=# CREATE ROLE <DRMS DB production user> LOGIN;
netdrms=# \q
$ 

As <NetDRMS production user>, run psql to add a password for this new database user:

$ whoami
<NetDRMS production user>
$ psql netdrms
netdrms=> ALTER ROLE <DRMS DB production user> WITH PASSWORD '<new password>';
netdrms=> \q
$

As <NetDRMS production user>, create a .pgpass file. This file contains the PostgreSQL user account password, obviating the need to manually enter the database password each time a database connection attempt is made:

$ whoami
<NetDRMS production user>
$ cd $HOME
$ vi .pgpass
i
<PostgreSQL host>:*:*:<DRMS DB production user>:<new password>
ESC
:wq
$ chmod 0600 .pgpass
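
As an alternative to editing the file by hand, the same entry can be appended from the shell. This is a sketch only: add_pgpass_entry is a hypothetical helper, and it assumes the password contains no ':' or '\' characters (PostgreSQL requires those to be escaped with a backslash in .pgpass).

```shell
# Append a "host:port:database:user:password" entry to a password file,
# with wildcards for port and database, and restrict the file to its owner.
# add_pgpass_entry is a hypothetical helper for illustration only.
add_pgpass_entry() {
    local file="$1" host="$2" user="$3" password="$4"
    # umask 077 in a subshell: a newly created file is never group/world readable
    ( umask 077; printf '%s:*:*:%s:%s\n' "$host" "$user" "$password" >> "$file" )
    chmod 0600 "$file"
}

# Usage:
#   add_pgpass_entry "$HOME/.pgpass" <PostgreSQL host> <DRMS DB production user> '<new password>'
```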

Run an SQL script, and a perl script (which executes several SQL scripts), both included in the NetDRMS installation, to create the admin and drms schemas and their relations, the jsoc and sumsadmin database users, data types, and functions:

$ ssh <PostgreSQL superuser>@<PostgreSQL host>
# use psql to execute SQL script
$ psql -p <PostgreSQL port> netdrms -f <NetDRMS install dir>/base/drms/scripts/NetDRMS.sql 
$ perl <NetDRMS install dir>/base/drms/scripts/createpgfuncs.pl

For more information about the purpose of these objects, read the comments in NetDRMS.sql and createpgfuncs.pl.

Create the SUMS database production user <SUMS production user>. Since PostgreSQL automatically attempts to use the Linux user name as the PostgreSQL user name when a connection attempt is made, use Linux user <NetDRMS production user> for database user <SUMS production user>:

$ whoami
<NetDRMS production user>
$ psql netdrms
netdrms=# CREATE ROLE <SUMS production user>;
netdrms=# \q
$ 

Create the SUMS database relations and sequences:

$ psql -h localhost -p <PostgreSQL port> -U production -f base/sums/scripts/postgres/create_sums_tables.sql data_sums

Then, as the PostgreSQL superuser, set the range of the sum_ds_index_seq sequence:

% psql -h <db server host> -p 5434 -U postgres data_sums
data_sums=# ALTER SEQUENCE sum_ds_index_seq START <min val> RESTART <min val> MINVALUE <min val> MAXVALUE <max val>;

where <min val> is <drms site code> << 48, <max val> is <min val> + 281474976710655 (2^48 - 1), and <drms site code> is the value of the DRMS_LOCAL_SITE_CODE parameter in config.local.
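
The two values can be computed in the shell, as sketched below. sums_seq_range is a hypothetical helper; the site code used in the example is a placeholder, since the real value is assigned by the JSOC. Shell arithmetic is 64-bit signed, so this works for site codes up to 32767.

```shell
# Print "<min val> <max val>" for the sum_ds_index_seq sequence of a given site code.
# sums_seq_range is a hypothetical helper for illustration only.
sums_seq_range() {
    local site_code="$1"
    local min=$(( site_code << 48 ))            # <drms site code> << 48
    local max=$(( min + 281474976710655 ))      # <min val> + (2^48 - 1)
    printf '%s %s\n' "$min" "$max"
}

# Placeholder site code 1:
sums_seq_range 1    # prints 281474976710656 562949953421311
```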

Create NetDRMS User Accounts

1. Each user of DRMS, including the production user, must set two environment variables, JSOCROOT and JSOC_MACHINE, as described above. Set these two environment variables in each user's .bashrc file; they must be set to use NetDRMS properly.

  2. Make .pgpass files and ensure that they work. You'll know you've done it right when the production user can connect to the database via "psql" without being prompted for a password. To do this, create a .pgpass file in the production user's home directory. Please see the PostgreSQL documentation for more information on the .pgpass file. It is important that the permissions for the .pgpass file are set to 600, so that the file is readable only by the individual user. You will need to adjust your pg_hba.conf settings in PostgreSQL in order for the .pgpass file to work correctly, and if you need to change pg_hba.conf later, you'll need to reload or restart the database for it to see the new settings. It is important that you fully test your .pgpass access with at least one user before proceeding; much depends on its working. If you cannot get it to work and need to step backward with less security, add the
    local    all    all    trust line back into pg_hba.conf and restart the database using % pg_ctl restart.

  3. Create database accounts for DRMS users. To use DRMS software/modules, a user of this software must have an account on the DRMS database (a DRMS series is implemented as several database objects). The software, when run, will log into a user account on the DRMS database - by default, the name of the user account is the name of the linux user account that the DRMS software runs under.
    1. Run the newdrmsuser.pl script. This script, and some other Perl scripts that follow, have a dependency on the DBD::Pg Perl package. Please ensure this package has been installed before proceeding. When you run newdrmsuser.pl, you will be prompted for the postgres dbuser password:

      % $JSOCROOT/base/drms/scripts/newdrmsuser.pl data <db server host> 5432 <db user> <initial password> <db user namespace> user 1

      where <db user> is the name of the user whose account is to be created and <db user namespace> is the namespace DRMS should use when running as the db user and reading or writing database tables. DRMS uses <db user namespace> to store user-specific database information, including DRMS data series information owned by that user. The namespace is a logical container of database objects, like database tables, sequences, functions, etc. The names of all objects are qualified by the namespace. For example, to unambiguously refer to the table "mytable", you prepend the name with the namespace. So, for example, if this table is in the su_production namespace (container), then you refer to the table as "su_production.mytable". In this way, there can be other tables with the same name, but that reside in a different namespace (e.g., su_arta.mytable is a different table that just happens to have the same name). Please see the NOTE in this page for assistance with choosing a namespace. <initial password> is the initial password for this account. This is another useful place for you to test your .pgpass files if you have access to a home account for testing purposes, such as your own user account. You may have a mis-configuration in your pg_hba.conf file that would make it appear that .pgpass was not working.

    2. Have the user that owns the account change the password:

      % psql -h <db server host> -p 5432 data
      data=> ALTER USER <db user> WITH PASSWORD '<new password>';

      where <new password> is the replacement for the original password. It must be enclosed in single quotes.

    3. Have the user put their password in their .pgpass file, as described above. This file allows the user to log in to their database account without having to provide a password at a prompt.

    4. Create a db account for the linux production user (the name is the value of the SUMS_MANAGER parameter in config.local). The name of the database user for this linux user is the same as the name of the linux user (typically 'production'). Follow the previous steps to use newdrmsuser.pl to create this database account. A good namespace for this account is <drms site>_production - this is what you'd use for <db user namespace>.

    5. Create a password for the sumsadmin DRMS database user, following the "ALTER USER" directions above. The user was created by the NetDRMS.sql script above.
    6. OPTIONALLY, create a table to be used for DRMS version control:
      % psql -h <db server host> -p 5432 -U <postgres administrator> data
      data=> CREATE TABLE drms.minvers(minversion text default '1.0' not null);
      data=> GRANT SELECT ON drms.minvers TO public;
      data=> INSERT INTO drms.minvers(minversion) VALUES('<version>');
      where <version> is the minimum DRMS version that a DRMS module must have before it can connect to the DRMS database.

Set Up the SUMS database

  1. Although the SUMS data cluster and SUMS database have been already created, you must create certain tables and users in this newly created database.
    1. Create the production user in the SUMS database:

      % psql -h <db server host> -p 5434 data_sums -U postgres
      data_sums=# CREATE USER <db production user> PASSWORD '<password>';

    2. Create a read-only user in the SUMS database (so users can read the SUMS DB tables):

      % psql -h <db server host> -p 5434 data_sums -U postgres
      data_sums=# CREATE USER readonlyuser PASSWORD '<password>';
      data_sums=# GRANT CONNECT ON DATABASE data_sums TO readonlyuser;

    3. Put the DRMS production db user into the sumsadmin group:

      % psql -h <db server host> -p 5432 data -U postgres
      data=# GRANT sumsadmin TO <db production user>;

      sum_rm, when run properly by the linux production user, will attempt to connect to the DRMS database as <db production user>. By putting it into the sumsadmin DB user group, we are giving sum_rm the ability to delete any record in any DRMS data-series record table. This permission is required for the archive == -1 implementation; this is the feature that causes SUMS to delete DRMS records from series whose archive flag is -1 when the DRMS records' SUs are deleted.

    4. Put the production user's password into the .pgpass file. Please see the information on the .pgpass file.

    5. SUMS data files are organized into "partitions", which are implemented as directories. Each partition must be named /SUM[0-9]+ (e.g., /SUM0, /SUM1, ..., /SUM58, ..., /SUM99, /SUM100, /SUM101, ...). Each directory must be owned by the production linux user (e.g., "production"). The linux group to which the directories belong must be the SUMS-user group (set up in step 1b. in the Users and Environment section, e.g. sumsuser). All SUMS users must be members of this group. For example, if linux user art will be using DRMS and running DRMS modules that access SUMS files, then art must be a member of the SUMS user group (e.g., sumsuser). You are free to create as few or as many of these partitions as you desire. Create these directories now.

      NOTE: Please avoid using file systems that limit the number of directories and/or files. For example, the EXT3 file system limits the number of directories to 64K. That number is far too small for SUMS usage.

    6. Initialize the sum_partn_avail table with the names of these partitions. For each SUMS partition run the following:

      % psql -h <db server host> -p 5434 -U postgres data_sums
      data_sums=# INSERT INTO sum_partn_avail (partn_name, total_bytes, avail_bytes, pds_set_num, pds_set_prime) VALUES ('<SUMS partition path>', <avail bytes>, <avail bytes>, 0, 0);

      where <SUMS partition path> is the full path of the partition (the path must be enclosed in single quotes) and <avail bytes> is some number less than the number of bytes in the directory (multiply the number of blocks in the directory by the number of bytes per block). The number does not matter, as long as it is not bigger than the total number of bytes available. SUMS will adjust this number as needed.
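
The partition naming rule and the byte-count arithmetic above can be sketched in Python as follows. This is only an illustration: the helper names are hypothetical, and the 0.9 safety fraction is an assumption chosen to keep the value below the space actually available. The sketch only builds the INSERT statement as text; you would still run it through psql as shown above.

```python
import os
import re

# Partition names must match /SUM followed by one or more digits.
PARTITION_NAME = re.compile(r"/SUM[0-9]+")


def is_valid_partition(path):
    """Return True if path follows the /SUM[0-9]+ naming rule."""
    return PARTITION_NAME.fullmatch(path) is not None


def suggested_avail_bytes(path, fraction=0.9):
    """Blocks available to unprivileged users times bytes per block, scaled down."""
    stats = os.statvfs(path)
    return int(stats.f_bavail * stats.f_frsize * fraction)


def insert_statement(path, avail_bytes):
    """Build the sum_partn_avail INSERT for one partition."""
    if not is_valid_partition(path):
        raise ValueError("not a valid SUMS partition name: " + path)
    return (
        "INSERT INTO sum_partn_avail "
        "(partn_name, total_bytes, avail_bytes, pds_set_num, pds_set_prime) "
        f"VALUES ('{path}', {int(avail_bytes)}, {int(avail_bytes)}, 0, 0);"
    )


if __name__ == "__main__":
    # Example with an explicit byte count; on a real system you might pass
    # suggested_avail_bytes("/SUM0") instead.
    print(insert_statement("/SUM0", 10 * 1024**4))
```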

Test your Postgres database installations

  1. Make sure that you, as the production user, and at least one other user can log in to both the SUMS and DRMS database instances without a password prompt, using psql and your .pgpass file.

  2. Do a \dt in both databases and check that you can see tables listed.

  3. Run SELECT * FROM sum_partn_avail; in the SUMS database and make sure that your SUMS partitions are accurately entered.
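
If you prefer to script these checks, the following Python sketch builds the corresponding psql command lines. It assumes the ports and database names used earlier in this guide (5432/data for DRMS, 5434/data_sums for SUMS); the function names are hypothetical. The psql -w option makes psql fail rather than prompt when no password is available, which is a convenient way to confirm that .pgpass is being used.

```python
import subprocess


def psql_check_command(host, port, database, user, command):
    """Build a psql invocation that never prompts for a password (-w)."""
    return ["psql", "-w", "-h", host, "-p", str(port), "-U", user,
            "-c", command, database]


def check_commands(host, user):
    """Return the three checks described above as argument lists."""
    return [
        psql_check_command(host, 5432, "data", user, r"\dt"),
        psql_check_command(host, 5434, "data_sums", user, r"\dt"),
        psql_check_command(host, 5434, "data_sums", user,
                           "SELECT * FROM sum_partn_avail;"),
    ]


def run_checks(host, user):
    """Run each check; return True only if every psql call exits with status 0."""
    results = [subprocess.run(cmd, capture_output=True, text=True)
               for cmd in check_commands(host, user)]
    return all(result.returncode == 0 for result in results)
```

For example, run_checks("<db server host>", "production") would return False if any of the three connections required a password prompt or failed for another reason.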

Third Party Software for NetDRMS

You will need the following third-party packages and main package libraries installed before compiling, or the compilation will not work. Please note that these are examples from some successful installations; your own machine may already be configured correctly, or it may need an entirely different set of packages to get to the same place. Even with the following installed, you may find during your make that you need further packages or libraries.

--Perl for scripts: V. 5.10 minimum; you may want development libraries installed. (Note that your OS may be relying on an old version of Perl and installing a new one directly on top of it may cause you strange and unexpected problems; parallel installation may be necessary.)

--Python, version 2.7 or higher (Note that some CentOS versions expect a lower version of Python for their own purposes, and installing directly on top of the existing Python may cause unexpected problems):
python33-python.x86_64
python33-python-devel.x86_64
python33-python-libs.x86_64

--To build hpn-ssh for regular file exchange with JSOC:
See instructions on http://www.psc.edu/index.php/hpn-ssh , which will first instruct you to get the OpenSSH source code from OpenSSH.org. You will also need to install the "patch" package if it's not on your machine already, to put your hpn-ssh code together.

Running SUMS Services

To launch the SUMS daemon, sumsd.py, use the start-mt-sums.py script:

$ ssh <production user>@<SUMS host>
$ sudo python3 start-mt-sums.py daemon=<path>/sumsd.py ports=6102 --instancesfile=sumsd-instances.txt --logfile=sumsd-6102-20190627.txt --loglevel=info

The complete usage is:

usage: start-mt-sums.py daemon=<path to daemon> ports=<listening ports> [ --instancesfile=<instances file path> ] [ --loglevel=<critical, error, warning, info, or debug>] [ --logfile=<file name> ] [ --quiet ]

optional arguments:
  -h, --help            show this help message and exit
  -i <instances file path>, --instancesfile <instances file path>
                        the json file which contains a list of all the
                        sumsd.py instances running
  -l LOGLEVEL, --loglevel LOGLEVEL
                        specifies the amount of logging to perform; in order
                        of increasing verbosity: critical, error, warning,
                        info, debug
  -L <file name>, --logfile <file name>
                        the file to which sumsd logging is written
  -q, --quiet           do not print any run information

required arguments:
  d <path to daemon>, daemon <path to daemon>
                        path of the sumsd.py daemon to launch
  p <listening ports>, ports <listening ports>
                        a comma-separated list of listening-port numbers, one
                        for each instance to be spawned

start-mt-sums.py will fork one or more sumsd.py daemon processes. The ports argument identifies the SUMS host ports on which sumsd.py will listen for client (DRMS module) requests. One sumsd.py process will be invoked per port specified. The instances file and log file reside in the path identified by the SUMLOG_BASEDIR config.local parameter. The instances file is used to track the running instances of sumsd.py and is used by stop-mt-sums.py to identify running daemons.
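
As an illustration of how the ports argument maps to daemon instances, the Python sketch below assembles a start-mt-sums.py command line from a list of ports, using only the arguments shown in the usage text above. The helper name start_command is hypothetical, and the sketch does not launch anything; prepend sudo or adjust paths as your site requires.

```python
LOG_LEVELS = ("critical", "error", "warning", "info", "debug")


def start_command(daemon_path, ports, instances_file=None, log_file=None,
                  log_level=None):
    """Build the argument list for start-mt-sums.py (one sumsd.py per port)."""
    if not ports:
        raise ValueError("at least one listening port is required")
    if log_level is not None and log_level not in LOG_LEVELS:
        raise ValueError("unknown log level: " + str(log_level))
    args = [
        "python3",
        "start-mt-sums.py",
        "daemon=" + daemon_path,
        # A comma-separated list; one daemon instance is spawned per port.
        "ports=" + ",".join(str(port) for port in ports),
    ]
    if instances_file is not None:
        args.append("--instancesfile=" + instances_file)
    if log_file is not None:
        args.append("--logfile=" + log_file)
    if log_level is not None:
        args.append("--loglevel=" + log_level)
    return args


if __name__ == "__main__":
    # Two ports, so two sumsd.py processes would be started.
    print(" ".join(start_command("<path>/sumsd.py", [6102, 6103],
                                 instances_file="sumsd-instances.txt",
                                 log_level="info")))
```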

To stop one or more SUMS services, use the stop-mt-sums.py script:

$ ssh <production user>@<SUMS host>
$ sudo python3 stop-mt-sums.py daemon=<path>/sumsd.py --ports=6102 --instancesfile=sumsd-instances.txt

The complete usage is:

usage: stop-mt-sums.py [ -h ] daemon=<path to daemon> [ --ports=<listening ports> ] [ --instancesfile=<instances file path> ] [ --quiet ]

optional arguments:
  -h, --help            show this help message and exit
  -p <listening ports>, --ports <listening ports>
                        a comma-separated list of listening-port numbers, one
                        for each instance to be stopped
  -i <instances file path>, --instancesfile <instances file path>
                        the json file which contains a list of all the
                        sumsd.py instances running
  -q, --quiet           do not print any run information

required arguments:
  d <path to daemon>, daemon <path to daemon>
                        path of the sumsd.py daemon to halt

Registering for Subscriptions

A NetDRMS site can optionally register for a data-series subscription to any NetDRMS site that offers such a service. The JSOC NetDRMS offers subscriptions, but at the time of this writing, no other site does. Once a site registers for a data series subscription, the site will become a mirror for that data series. The subscription process ensures that the mirroring site will receive regular updates made to the data series by the serving site. The subscribing site can configure the interval between updates such that the mirror can synchronize with the server and receive updates within a couple of minutes, keeping the mirror up-to-date in (almost) real time.

To register for a subscription, run the subscribe.py script (included in the base NetDRMS installation). This script makes subscription requests to the serving site's subscription manager. The process entails the creation of a snapshot of the data series at the serving site. Those data are downloaded, via HTTP, to the subscribing site, where they are ingested by subscribe.py. Ingestion results in the creation of the DRMS database objects that maintain and store the data series. At this time, no SUMS data files are downloaded. Instead, and optionally, the IDs for the series' SUMS Storage Units (SUs) are saved in a database relation. Other NetDRMS daemons can make use of this relation to automatically download and ingest the SUs into the subscriber's SUMS. The Remote SUMS Client, rsums-clientd.py, manages this list of SUs, making SU-download requests to another client-side daemon, Remote SUMS, rsumsd.py. rsumsd.py accepts SU requests from rsums-clientd.py and downloads the SUs via scp - each scp instance downloads multiple SUs.

The automatic download of data-series SUs is optional. They can be downloaded on-demand as well. In fact, if the subscribing NetDRMS site were to automatically download an SU, then delete the SU (there is a method to do this, described later), then an on-demand download is the only way to re-fetch the deleted SU. On-demand downloads happen automatically; any DRMS module that attempts to access an SU (like with a show_info -P command) that is not present for any reason will trigger an rsumsd.py request. The module will pause until the SU has been downloaded, then automatically resume its operation on the previously missing SU.

As rsumsd.py uses scp to automatically download SUs, SSH public-private keys must be created at the subscribing site, and the public key must be provided to the serving site. Setting this up requires coordinated work at both the subscribing and serving sites:

  1. On the subscribing site, run

$ sudo su - <production user>
$ ssh-keygen -t rsa

This will allow you to create a passphrase for the key. If you choose to do this, then save this phrase for later steps. In the .ssh directory under the home directory of <production user>, ssh-keygen will create a private key named id_rsa and a public key named id_rsa.pub.

  2. Provide id_rsa.pub to the serving site.

  3. The serving site must then add the public key to its list of authorized keys. If the .ssh directory does not exist, then the serving site must first create this directory and give it 0700 permissions. If the authorized_keys file in .ssh does not exist, then it must first be created and given 0644 permissions:

$ sudo su - <subscription production user>
$ mkdir .ssh
$ chmod 0700 .ssh
$ cd .ssh
$ touch authorized_keys
$ chmod 0644 authorized_keys

Once the .ssh directory and the authorized_keys file exist and have the proper permissions, the serving site administrator can then add the client site's public key to its list of authorized keys:

$ sudo su - <subscription production user>
$ cd <subscription production user home directory>/.ssh
$ cat id_rsa.pub >> authorized_keys

  4. If an SSH passphrase was chosen in step 1, then back at the client site, <production user> must start an ssh-agent instance to automate the passphrase authentication. If no passphrase was provided in step 1, this step can be skipped. Otherwise, run (assuming bash syntax - read the man page for csh syntax):

$ sudo su - <production user>
$ ssh-agent > ~/.ssh-agent
$ source ~/.ssh-agent # needed for ssh-add, and also for rsumsd.py and get_slony_logs.pl
$ ssh-add ~/.ssh/id_rsa

To keep an ingested data series synchronized with changes made to it at the serving site, a client-side cron job runs periodically. It runs get_slony_logs.pl, a perl script that uses scp to download "slony log files" - SQL files that insert, delete, or update database relation rows. get_slony_logs.pl communicates with the Slony-I replication software running at the serving site. Slony-I generates these log (SQL) files at the server, and the client then downloads them.

To register for a subscription to a new series, run:

You may find that a subscription has gotten out of sync, for various reasons, with the serving site's data series (accidental deletion of database rows, for example). subscribe.py can be used to alleviate this problem. Run the following to re-do the subscription registration:

Finally, there might come a time when you no longer wish to hold on to a registration. To remove the subscription from your set of registered data series, run:

For example, the JSOC maintains time-distance analysis code that is part of the JSOC DRMS code tree but is not part of the base NetDRMS package provided to remote sites. A NetDRMS site can install such project code by modifying a configuration file (config.local). Doing so may require the installation of third-party software, such as math libraries and MPI.

Performing a Test Run

At this point, it is a good idea to test your installation. Although you have no DRMS/SUMS data at this point, running show_series is a good way to test various components, like authentication, database connection, etc. To test SUMS, however, you will need to have at least one DRMS data series that has SUMS data. You can obtain such a data series by using the subscription system.

Test DRMS by running the show_series command:

$ show_series

If you see no errors, then life is good.

After you have at least one data series, you can do more thorough testing. For example, you can run:

$ show_info -j <DRMS data series>

To test SUMS (once you have some data files in your NetDRMS), you can run:

$ show_info -P <DRMS record-set specification>

To update to a newer NetDRMS release, simply create a new directory to contain the build, copy the previous config.local into the new <JSOC root> and edit it if new parameters have been added to config.local, and follow the directions for compiling DRMS. Any previous-release daemons that were running will need to be shut down, and the daemons in the newer release started.

Deciding what's next

You may wish to run a JMD or use Remote SUMS. The decision should be discussed with JSOC personnel. Once you've made this decision and installed the appropriate software (see below for Remote SUMS), you'll need to populate your DRMS database with data. For this, you'll need to be a recipient of Slony subscription data. We recommend contacting the JSOC directly to become a subscriber.

Remote SUMS

A local NetDRMS may contain data produced by other, non-local NetDRMSs. Via a variety of means, the local NetDRMS can obtain and ingest the database information for these data series produced non-locally. In order to use the associated data files (typically image files), the local NetDRMS must download the storage units (SUs) associated with these data series too. There are currently two methods to facilitate these SU downloads. The Java Mirroring Daemon (JMD) is a tool that can be installed and configured to download SUs automatically as the series data records are ingested into the local NetDRMS. It fetches these SUs before they are actually used. It can obtain the SUs from any other NetDRMS that has the SUs, not just the NetDRMS that originally produced them. Remote SUMS is a built-in tool that comes with NetDRMS. It downloads SUs as needed - i.e., if a module or program requests the path to the SU or attempts to read it, and it is not present in the local SUMS yet, Remote SUMS will download the SUs. While the SUs are being downloaded, the initiating module or program will poll waiting for the download to complete.

Several components compose Remote SUMS. On the client side (the local NetDRMS), a daemon, rsumsd.py, must be running. Some database tables must also exist, as well as some binaries used by the daemon. On the server side (every NetDRMS site that wishes to act as a source of SUs for the client), a CGI, rs.sh, must be available. This CGI returns file-server information (hostname, port, user, SU paths, etc.) for the SUs the server has available in response to requests that contain a list of SUNUMs. When the client encounters requests for remote SUs that are not contained in the local SUMS, it asks the daemon to download those SUs. The client code then polls, waiting for the request to be serviced. The daemon in turn sends requests to the rs.sh CGIs at all the relevant providing sites. The owning sites return the file-server information to the daemon, and then the daemon downloads the SUs the client has requested, via scp, and notifies the client module once the SUs are available for use. The client module will then exit from its polling code and continue, using the freshly downloaded SUs.
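
The client-side polling described above follows a common pattern, sketched below in Python. This is not the DRMS implementation (the DRMS library is written in C); wait_for_request, its parameters, and the is_complete callback are hypothetical names used only to illustrate polling until a request has been serviced or a timeout expires.

```python
import time


def wait_for_request(is_complete, interval=1.0, timeout=60.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_complete() until it returns True or the timeout elapses.

    Returns True if the request completed, False if the wait timed out.
    """
    deadline = clock() + timeout
    while True:
        if is_complete():
            return True
        if clock() >= deadline:
            return False
        # Pause between checks so polling does not monopolize the CPU.
        sleep(interval)


if __name__ == "__main__":
    # Simulated download that reports completion on the third check.
    checks = {"count": 0}

    def download_done():
        checks["count"] += 1
        return checks["count"] >= 3

    print(wait_for_request(download_done, interval=0.01, timeout=5.0))
```

In a real client, is_complete would check the status of the SU request rather than a counter.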

To use Remote SUMS, the config.local configuration file must first be configured properly, and NetDRMS must be re-built. Here are the relevant config.local parameters:

After setting-up config.local, you must build or re-build NetDRMS:

% cd $JSOCROOT
% configure
% make

It is important to ensure that three binaries needed by the Remote SUMS daemon have been built: jsoc_fetch, vso_sum_alloc, vso_sum_put.

Ensure that Python >= 2.7 is installed. You will need to install some packages if they are not already installed: psycopg2, ...

An output log named rslog_YYYYMMDD.txt will be written to the directory identified by the RS_LOGDIR config.local parameter, so make sure that directory exists.
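
A small Python sketch of this preparation step follows. It creates the log directory if it is missing and shows the file name to expect for a given day. The helper names are hypothetical, and the directory path is whatever you have set for RS_LOGDIR in config.local; the daemon itself writes the log.

```python
import datetime
import os


def rslog_name(day):
    """Return the log file name for a date, in the rslog_YYYYMMDD.txt form."""
    return day.strftime("rslog_%Y%m%d.txt")


def prepare_log_dir(log_dir, day=None):
    """Create log_dir if needed and return the expected log path for the day."""
    if day is None:
        day = datetime.date.today()
    os.makedirs(log_dir, exist_ok=True)
    return os.path.join(log_dir, rslog_name(day))
```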

Give your public SSH key to all providing NetDRMS sites. They will need to put that key in their authorized_keys file.

Create the client-side Remote SUMS database tables. Run:

% $JSOCROOT/base/drms/scripts/rscreatetabs.py op=create tabs=req,su

The rsumsd.py daemon must be started as the user specified by the RS_DBUSER config.local parameter. As this user, first start an ssh-agent process and add the private key to it:

% ssh-agent -c > $HOME/.ssh-agent_rs
% source $HOME/.ssh-agent_rs
% ssh-add $HOME/.ssh/id_rsa

This allows you to use a public-private key pair that has a passphrase without having to enter that passphrase manually each time the Remote SUMS daemon runs scp.

Start SUMS:

% $JSOCROOT/base/sums/scripts/sum_start.NetDRMS >& <log dir>/sumsStart.log

Substitute your favorite log directory for <log dir>. There is another daemon, sums_procck.py, that keeps SUMS up and running once it is started. Redirecting to a log will preserve important information that this daemon prints. To stop SUMS, use $JSOCROOT/base/sums/scripts/sum_stop.NetDRMS.

Start the Remote SUMS daemon:

% $JSOCROOT/base/drms/scripts/rsumsd.py

Subscribing to Series