Differences between revisions 148 and 149
Revision 148 as of 2019-12-12 02:36:38
Size: 43591
Editor: ArtAmezcua
Comment:
Revision 149 as of 2019-12-12 02:41:29
Size: 43712
Editor: ArtAmezcua
Comment:
NetDRMS - a shared data management system

Introduction

In order to process, archive, and distribute the substantial quantity of solar data captured by the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI) instruments on the Solar Dynamics Observatory (SDO), the Joint Science Operations Center (JSOC) has developed its own data-management system, NetDRMS. This system comprises two PostgreSQL databases, multiple file systems, a tape back-up system, and software to manage these components. Related sets of data are grouped into data series. Conceptually, each data series is a table in which each row is typically associated with an observation time or a Carrington rotation. As an example, the data series hmi.M_45s contains the HMI 45-second cadence magnetograms, both observation metadata and image FITS files. The columns contain metadata, such as the observation time, the ID of the camera used to acquire the data, the image rotation, etc. One column in this table contains an ID that refers to a set of data files, typically a set of FITS files that contain images.

The Data Record Management System (DRMS) is the subsystem that contains and manages the "DRMS" database of metadata and data-file-locator information. One component is a software library, written in C, that provides client programs, also known as "DRMS modules", with an Application Programming Interface (API) that allows the users to access these data. The Storage Unit Management System (SUMS) is the subsystem that contains and manages the "SUMS" database and associated storage hardware. The database contains information needed to locate data files that reside on hardware. The entire system as a whole is typically referred to as DRMS. The user interfaces with the DRMS subsystem only, and the DRMS subsystem interfaces with SUMS - the user does not interact with SUMS directly. The JSOC provides NetDRMS to non-JSOC institutions so that those sites can take advantage of the JSOC-developed software to manage large amounts of solar data.

A NetDRMS site is an institution with a local NetDRMS installation. It does not generate the JSOC-owned production data series (e.g., hmi.M_720s, aia.lev1) that Stanford generates for scientific use. A NetDRMS site can generate its own data, production or otherwise, and it can create software that uses NetDRMS to generate its own data series. It can also act as a "mirror" for individual data series. When acting as a mirror for a Stanford data series, the site downloads DRMS database information from Stanford and stores it in its own NetDRMS database, and it downloads SUMS files and stores them in its own SUMS subsystem. As the data files are downloaded to the local SUMS, the SUMS database is updated with the information needed to manage the data files. It is possible for a NetDRMS site to mirror the DRMS data of any other NetDRMS site, but at this point, the only site whose data are mirrored is the Stanford JSOC.

Installing NetDRMS

Installing the NetDRMS system requires:

  • selecting appropriate machines to host DRMS, SUMS, the PostgreSQL relational database management system software, and database clusters
  • installing PostgreSQL

  • instantiating two PostgreSQL clusters, one for DRMS and one for SUMS
  • creating PostgreSQL users, databases, relations, procedures, and other objects
  • installing CFITSIO
  • installing packages to the system Python 3, or installing a new distribution, like Anaconda
  • installing the NetDRMS software code tree, which includes code to create DRMS libraries and modules and SUMS libraries

  • initializing storage hardware such as hard drives or SSD drives
  • instantiating various NetDRMS daemons, such as the SUMS daemon (which accepts and processes SUMS requests from DRMS clients)
  • creating NetDRMS user accounts

Optional steps include:

  • installing JSOC-specific project code that is not part of the base NetDRMS installation; the JSOC maintains code to generate JSOC-owned data that is not generally of interest to NetDRMS sites, but sites are welcome to obtain downloads of that code. Doing so involves additional configuration to the base NetDRMS system.
  • running an ssh agent daemon to provide non-interactive access to the data at other NetDRMS sites and the JSOC
  • subscribing to JSOC data series and running NetDRMS software to receive, in real time, data updates
  • installing Slony PostgreSQL data-replication software to become a provider of your site's data
  • installing a webserver that hosts several NetDRMS CGIs to allow web access to your data
  • installing the Virtual Solar Observatory (VSO) software to become a VSO provider of data

For best results, and to facilitate debugging issues, please follow these steps in order.

If you want to receive replicated data from the JSOC, you will need to install some scripts, set up your ssh keys, and install a software package called hpn-ssh.

As an example of JSOC-specific project code, the JSOC maintains time-distance analysis code that is part of the JSOC DRMS code tree but is not part of the base NetDRMS package provided to remote sites. A NetDRMS site can install such project code by modifying a configuration file (config.local); this may require the installation of third-party software, such as math libraries and MPI.

First, you will need to create a few linux users and groups, giving them the needed permissions (see step 1 below). Second, you will need to install the PostgreSQL Relational Database Management System (PG) and create two databases (see step 2 below). Third, you will need to establish disk storage for SUMS (see "Setting up a SUMS" below). Fourth, you will need to install third-party libraries needed by DRMS and SUMS (see X below). Fifth, you will need to build and install NetDRMS and SUMS (see X below).

To install NetDRMS and SUMS, please follow these directions in order. All accounts/paths/ports/etc. referenced can be modified, but we recommend not doing this unless you are certain they must be different. Debugging issues from Stanford becomes difficult if every site does things differently.

Bear in mind that you may have to change the ownership and permissions on the $DRMS directory as you go through the install process and determine the user that will run the code.

Selecting Host Machines

The optimal hardware configuration will likely depend on your needs, but the following recommendations should suffice for most sites. DRMS and SUMS can share a single host machine. The most widely used and tested Linux distributions are Fedora-based, and at the time of this writing, CentOS is the most popular. Sites have successfully used openSUSE too, but if possible, we would recommend using CentOS.

Initializing the Linux Environment

  • Initialize your linux environment for NetDRMS installation (to be done by a superuser or sudoer):
  • Create a production linux user <production user> (named production by default):

$ sudo useradd production
$ sudo passwd production
Changing password for user production.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
$

<production user> becomes the value of the SUMS_MANAGER parameter in the config.local file.

  1. Create the linux group <SUMS users>, to which all SUMS users belong, including <production user>, e.g. sums-users:

$ sudo groupadd sums-users

  1. Add all SUMS users (users who will read/write SUMS data files) to <SUMS users>:

$ sudo usermod -a -G sums-users production
$ id production
uid=1002(production) gid=1002(production) groups=1002(production),1003(sums-users)

  1. Each user of DRMS, including the production user, must set two environment variables in their environment:

% setenv JSOCROOT <DRMS source tree root>
% setenv JSOC_MACHINE <OS and CPU>

<DRMS source tree root> is the root of the DRMS source tree installed by the production linux user, and <OS and CPU> is "linux_x86_64", if DRMS was installed on a machine with a Linux OS and a 64-bit processor, or "linux_avx", if DRMS was installed on a machine with a Linux OS and a 64-bit processor that supports Advanced Vector Extensions (which supports an extended instruction set). You may wish to have the NetDRMS software installed and compiled before you put the $JSOC_MACHINE variable into play.

  1. Create the SUMS log directory on the SUMS server machine, if it does not already exist. The name/path for this directory is defined in config.local in the SUMS_LOG_BASEDIR parameter. The actual directory must match the value of this parameter, which defaults to /usr/local/logs/SUM. You are free to change this path in SUMS_LOG_BASEDIR, to, say, /var/log/sums or whatever is consistent with your system logs. This directory must be writeable by the linux production user.

Installing PostgreSQL

  1. Select a host machine, <PostgreSQL host>, to act as the PostgreSQL database server.

  2. Install the needed PostgreSQL server packages on <PostgreSQL host>; the following assumes a Fedora-based linux system such as CentOS (documentation for other distributions, such as Debian and openSUSE, can be found online):

$ sudo yum install <package>

    • PostgreSQL 10 (as of the time of this wiki update, version 10 was the latest version - use the latest version if possible):
      • postgresql10
      • postgresql10-devel
      • postgresql10-libs
      • postgresql10-server
    • Previous versions of PostgreSQL:
      • postgresql
      • postgresql-devel
      • postgresql-libs
      • postgresql-server

  3. Install the needed PostgreSQL client packages on all hosts on which DRMS modules will be built and executed:

    • PostgreSQL 10:
      • postgresql10
      • postgresql10-devel
      • postgresql10-libs
    • Previous versions of PostgreSQL:
      • postgresql
      • postgresql-devel
      • postgresql-libs

The rpm package installation will have created the PostgreSQL superuser linux account <PostgreSQL superuser> (i.e., postgres); <PostgreSQL superuser> will own the PostgreSQL database clusters and server processes that will be created in the following steps.

  1. Create the database clusters for the two database instances (one instance for DRMS data, and one for SUMS data). PostgreSQL is a Database Management System that supports multiple independent database instances. The disk files needed for data storage for each database instance reside on a disk storage location known as a database cluster. Each cluster contains storage for one or more database instances - in the case of NetDRMS, the DRMS database resides in one cluster, and the SUMS database resides in a second cluster.

    1. The DRMS cluster:

$ sudo -u postgres initdb --locale=C -D /var/lib/pgsql/data

A database cluster is a storage area on disk that contains the data for one or more databases. The storage area is implemented as a directory (the data directory) and it is managed by a single instance of a Postgres server process. To create this cluster (data directory), first log-in as the linux user postgres, and then run the initdb command:

% initdb --locale=C -D /var/lib/pgsql/data

This will create the data directory /var/lib/pgsql/data on the database server host. If you want to place the data in a different directory, go right ahead and change the -D parameter value. The "--locale" argument will set cluster locale to "C". Locale support refers to an application respecting cultural preferences regarding alphabets, sorting, number formatting, etc. PostgreSQL uses the standard ISO C and POSIX locale facilities provided by the server operating system. We recommend "C" and make no guarantees what will happen to your formatting if you deviate.

  1. Also as user postgres, create a database cluster for the SUMS data. This cluster is distinct from the cluster for the DRMS data, and it is maintained by a separate server instance:

    % initdb --locale=C -D /var/lib/pgsql/data_sums

    This will create the data directory /var/lib/pgsql/data_sums on the database server host (or wherever you've decided to put the cluster with the -D parameter).

  2. Edit the Postgres configuration files - you will have these in two different places, one for each cluster you created with initdb. The configuration files are cluster-specific, and they reside in the data directory created by the initdb command. These are the key parameters which will determine your database efficiency and security. A complete list of all modifiable parameters can be found in the Postgres online documentation, but a few are worth mentioning now.
    1. listen_addresses (in postgresql.conf) is a list of the server host's own IP addresses (network interfaces) on which the server listens for client connections. By default the value of the parameter is "localhost", which accepts TCP/IP connections only from the machine hosting the database server process. This is not what you want. The single-quoted string '*' makes the server listen on all of the host's interfaces, so that other machines can connect. If you want to be more restrictive, you can provide a comma-separated list of the server host's names or IP addresses on which to listen. Which client machines may connect is controlled separately, in pg_hba.conf.
    2. port (in postgresql.conf) is the port on which the server listens for connections. If you create more than one cluster on the host server machine (e.g., if you create both the DRMS and SUMS clusters on a single host), then you'll need to change the port number for at least one cluster (you cannot have two server processes listening for connections on the same port). We suggest using port 5432 for the DRMS cluster (port = 5432 - no quotes), and port 5434 for the SUMS cluster. Note that port 5432 is the default port for Postgres.
    3. logging_collector (in postgresql.conf). Set this to 'on' so that the output of the Postgres server process will be captured into log files and rotated once per day.
    4. log_rotation_size (in postgresql.conf). Set this to 0. This will cause PG to emit one log every day (as opposed to starting a new log after the previous log is a certain size).
    5. log_min_duration_statement (in postgresql.conf). Set this to 1000 so that only queries that are greater than 1000 ms in run time will be logged. Otherwise, the log files will quickly get out of hand.
    6. shared_buffers. Set this to how much memory you want to devote to running the database. The default is 128 MB, so you should increase it. You may also wish to adjust the values for work_mem, maintenance_work_mem, and max_stack_depth, but consult the Postgres manual for a better understanding.
    7. Adjust and learn about the pg_hba.conf file. This file contains lines of the form

      <connection type>  <databases>  <user>  <IP address>  <IP mask>  <authentication method>

      if you wish to use an IP-address mask to specify a range of IP addresses, or

      <connection type>  <databases>  <user>  <CIDR-address>  <authentication method>

      if you wish to use a CIDR-address to specify the range. To get yourself up and running, you'll need to add a line or two to this file. To allow access by one host, we suggest

      host  all  all XXX.XXX.XXX.XXX  255.255.255.255  md5

      or

      host  all  all XXX.XXX.XXX.XXX/32  md5

      For multiple-host access, we suggest

      host  all  all XXX.XXX.XXX.0  255.255.255.0  md5

      or

      host  all  all  XXX.XXX.XXX.0/24  md5

      The md5 authentication method requires a password, which is what will trigger the use of user .pgpass files. You may also wish to comment out the line

      local    local      trust

      This line allows anyone on the local machine to log in with no password, and isn't good for long-term security. Once you've commented out the

      local    local      trust

      line, you will no longer be able to log in without a correctly made .pgpass file. Please note that whenever you make changes to pg_hba.conf, you will need to restart (or reload) the database server for the changes to take effect. You can test your changes once you've started the server.
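The two pg_hba.conf address forms above (address plus mask, or CIDR) describe the same range of clients. As a minimal sketch, the standard Python ipaddress module can convert one form to the other and check whether a given client address falls inside the range before you edit the file. The helper names and the 192.0.2.x documentation addresses are illustrative assumptions, not part of NetDRMS.

```python
import ipaddress


def to_cidr(address, mask):
    """Convert an <IP address> <IP mask> pair into the equivalent CIDR string."""
    # ip_network() accepts "address/netmask" and normalizes it to a prefix length.
    # It raises ValueError if the address has host bits set for that mask.
    return str(ipaddress.ip_network(f"{address}/{mask}"))


def host_allowed(client_ip, cidr):
    """Return True if client_ip falls inside the range described by cidr."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr)


# Hypothetical example addresses (RFC 5737 documentation range).
print(to_cidr("192.0.2.17", "255.255.255.255"))
print(to_cidr("192.0.2.0", "255.255.255.0"))
print(host_allowed("192.0.2.40", "192.0.2.0/24"))
```

A check like this only confirms that the range is written as intended; the server still decides access from the first matching line in pg_hba.conf.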

Start Postgres and Install Data Structures

  1. The remainder of the instructions require that the Postgres servers (there is one for the DRMS cluster, and one for the SUMS cluster) be running. To start-up the server instances run:

    % su postgres
    % pg_ctl start -D /var/lib/pgsql/data # start the DRMS-database cluster server
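    % pg_ctl start -D /var/lib/pgsql/data_sums -o "-p 5434" # start the SUMS-database cluster server

    The -o option passes "-p 5434" through to the server process; it is not needed if the port has already been set in the SUMS cluster's postgresql.conf. The server logs will be placed in the pg_log subdirectory for each cluster (the subdirectory is named log in PostgreSQL 10 and later).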

  2. Test pg_hba.conf.
    1. Make .pgpass files and ensure that they work. You'll know you've done it right when the production user can connect to the database via "psql" without being prompted for a password. To do this, create a .pgpass file in the production user's home directory; see the PostgreSQL documentation for details on the .pgpass file. It is important that the permissions for the .pgpass file are set to 600, readable and writable only by the individual user. You will need to adjust your pg_hba.conf settings in Postgres in order for the .pgpass file to work correctly, and if you need to change pg_hba.conf later, you'll need to restart (or reload) the database server to get it to see the new settings. It is important that you fully test your .pgpass access with at least one user before proceeding; much depends on its working. If you cannot get it to work and need to step backward with less security, add the
      local    local     trust
      line back into pg_hba.conf and restart the database using % pg_ctl restart.
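As a minimal sketch of the .pgpass step, the following Python helpers build an entry in the hostname:port:database:username:password format that PostgreSQL expects and write the file with mode 600. The function names, host name, and password shown are hypothetical placeholders.

```python
import os


def pgpass_line(host, port, database, user, password):
    """Build one .pgpass entry: hostname:port:database:username:password."""
    def esc(field):
        # A literal backslash or colon inside a field must be backslash-escaped.
        return str(field).replace("\\", "\\\\").replace(":", "\\:")
    return ":".join(esc(f) for f in (host, port, database, user, password))


def write_pgpass(path, lines):
    """Write the entries to path, leaving the file readable only by its owner."""
    # Create the file with mode 0600 so the password is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        for line in lines:
            f.write(line + "\n")
    # Tighten permissions in case the file already existed with a looser mode.
    os.chmod(path, 0o600)


# Example (hypothetical host and password), one entry per cluster:
# write_pgpass(os.path.expanduser("~/.pgpass"), [
#     pgpass_line("dbhost.example.org", 5432, "data", "production", "secret"),
#     pgpass_line("dbhost.example.org", 5434, "data_sums", "production", "secret"),
# ])
```

psql ignores a .pgpass file whose permissions are looser than 600, which is why the sketch sets the mode both at creation and afterwards.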

  3. Create the DRMS database in the DRMS cluster, and create the SUMS database in the SUMS cluster:

    % su postgres
    % createdb --locale C -E LATIN1 -T template0 data # create the DRMS database in the DRMS-database cluster
    % createdb --locale C -E LATIN1 -T template0 -p 5434 data_sums # create the SUMS database in the SUMS-database cluster

    NOTE: The -E flag sets the character encoding of the characters stored in the database. LATIN1 is not a great choice (it would have been better to have used SQL_ASCII or UTF8), but that is what was chosen at Stanford so we're stuck with it, which means remote sites that have become series subscribers are stuck with it too.

  4. Install the required DB-server languages:

    % createlang -h <db server host> -p 5432 -U postgres plpgsql data # Add the plpgsql language to the DRMS database
    % createlang -h <db server host> -p 5432 -U postgres plperl data # Add the plperl language to the DRMS database
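    % createlang -h <db server host> -p 5432 -U postgres plperlu data # Add the plperlu 'untrusted' language to the DRMS database

    At this time, there are no auxiliary languages needed for the SUMS database. Note that the createlang utility was removed in PostgreSQL 10; on version 10 and later, install the Perl languages by running CREATE EXTENSION plperl; and CREATE EXTENSION plperlu; in the DRMS database instead (plpgsql is installed by default).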

  5. Create various tables and DRMS database functions needed by the DRMS library. You will need the NetDRMS source code for this:

    # Create the 'admin' schema and tables within this schema; create the 'drms' schema.
    # Create the sumsadmin database user (which can delete rows from any DRMS data-series record DB table - used for the data-series archive == -1 feature); create the 'jsoc' database user.
    # In order to read from table _jsoc.sl_table (which may not exist), there must be a jsoc role
    # that has SELECT permissions on this table. If no _jsoc.sl_table exists, there is no need
    # for the jsoc role to exist, however its existence in the absence of a _jsoc.sl_table is
    # innocuous, so we always create role jsoc.
    # The database function drms_replicated() runs a query to read from _jsoc.sl_table.
    % psql -h <db server host> -p 5432 -U postgres data -f $JSOCROOT/base/drms/scripts/NetDRMS.sql

    # Create the PostgreSQL functions used by DRMS.
    % su postgres
    % cd $JSOCROOT/base/drms/scripts
    % ./createpgfuncs.pl data # Create functions in the DRMS database

  6. Create database accounts for DRMS users. To use DRMS software/modules, a user of this software must have an account on the DRMS database (a DRMS series is implemented as several database objects). The software, when run, will log into a user account on the DRMS database - by default, the name of the user account is the name of the linux user account that the DRMS software runs under.
    1. Run the newdrmsuser.pl script. This script, and some other Perl scripts that follow, have a dependency on the DBD::Pg Perl package. Please ensure this package has been installed before proceeding. When you run newdrmsuser.pl, you will be prompted for the postgres dbuser password:

      % $JSOCROOT/base/drms/scripts/newdrmsuser.pl data <db server host> 5432 <db user> <initial password> <db user namespace> user 1

      where <db user> is the name of the user whose account is to be created and <db user namespace> is the namespace DRMS should use when running as the db user and reading or writing database tables. DRMS uses <db user namespace> to store user-specific database information, including DRMS data series information owned by that user. The namespace is a logical container of database objects, like database tables, sequences, functions, etc. The names of all objects are qualified by the namespace. For example, to unambiguously refer to the table "mytable", you prepend the name with the namespace. So, for example, if this table is in the su_production namespace (container), then you refer to the table as "su_production.mytable". In this way, there can be other tables with the same name, but that reside in a different namespace (e.g., su_arta.mytable is a different table that just happens to have the same name). Please see the NOTE in this page for assistance with choosing a namespace. <initial password> is the initial password for this account. This is another useful place for you to test your .pgpass files if you have access to a home account for testing purposes, such as your own user account. You may have a mis-configuration in your pg_hba.conf file that would make it appear that .pgpass was not working.

    2. Have the user that owns the account change the password:

      % psql -h <db server host> -p 5432 data
      data=> ALTER USER <db user> WITH PASSWORD '<new password>';

      where <new password> is the replacement for the original password. It must be enclosed in single quotes.

    3. Have the user put their password in their .pgpass file (see the PostgreSQL documentation for information on the .pgpass file). This file allows the user to log in to their database account without having to provide a password at a prompt.

    4. Create a db account for the linux production user (the name is the value of the SUMS_MANAGER parameter in config.local). The name of the database user for this linux user is the same as the name of the linux user (typically 'production'). Follow the previous steps to use newdrmsuser.pl to create this database account. A good namespace for this account is <drms site>_production - this is what you'd use for <db user namespace>.

    5. Create a password for the sumsadmin DRMS database user, following the "ALTER USER" directions above. The user was created by the NetDRMS.sql script above.
    6. OPTIONALLY, create a table to be used for DRMS version control:
      % psql -h <db server host> -p 5432 -U <postgres administrator> data
      data=> CREATE TABLE drms.minvers(minversion text default '1.0' not null);
      data=> GRANT SELECT ON drms.minvers TO public;
      data=> INSERT INTO drms.minvers(minversion) VALUES(<version>);
      where <version> is the minimum DRMS version that a DRMS module must have before it can connect to the DRMS database.

Set Up the SUMS database

  1. Although the SUMS data cluster and SUMS database have been already created, you must create certain tables and users in this newly created database.
    1. Create the production user in the SUMS database:

      % psql -h <db server host> -p 5434 data_sums -U postgres
      data_sums=# CREATE USER <db production user> PASSWORD '<password>';

    2. Create a read-only user in the SUMS database (so users can read the SUMS DB tables):

      % psql -h <db server host> -p 5434 data_sums -U postgres
      data_sums=# CREATE USER readonlyuser PASSWORD '<password>';
      data_sums=# GRANT CONNECT ON DATABASE data_sums TO readonlyuser;

    3. Put the DRMS production db user into the sumsadmin group:

      % psql -h <db server host> -p 5432 data -U postgres
      data=# GRANT sumsadmin TO <db production user>;

      sum_rm, when run properly by the linux production user, will attempt to connect to the DRMS database as <db production user>. By putting it into the sumsadmin DB user group, we are giving sum_rm the ability to delete any record in any DRMS data-series record table. This permission is required for the archive == -1 implementation; this is the feature that causes SUMS to delete DRMS records from series whose archive flag is -1 when the DRMS records' SUs are deleted.

    4. Put the production user's password into the .pgpass file (see the PostgreSQL documentation for information on the .pgpass file).

    5. Create the SUMS database tables:

      % psql -h <db server host> -p 5434 -U production -f base/sums/scripts/postgres/create_sums_tables.sql data_sums
      % psql -h <db server host> -p 5434 -U postgres data_sums
      data_sums=# ALTER SEQUENCE sum_ds_index_seq START <min val> RESTART <min val> MINVALUE <min val> MAXVALUE <max val>;

      where <min val> is <drms site code> << 48, <max val> is <min val> + 281474976710655 (2^48 - 1), and <drms site code> is the value of the DRMS_SITE_CODE parameter in config.local.
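The sequence bounds are simple bit arithmetic on the site code. The sketch below computes them and prints the corresponding ALTER SEQUENCE statement; the function names are hypothetical, and the site code 1 in the example is only a placeholder for the DRMS_SITE_CODE value assigned to your site by the JSOC.

```python
def sums_sequence_bounds(site_code):
    """Return (min_val, max_val) for sum_ds_index_seq given DRMS_SITE_CODE."""
    if site_code < 0:
        raise ValueError("site code must be non-negative")
    min_val = site_code << 48          # the site code occupies the bits above bit 47
    max_val = min_val + (2**48 - 1)    # 281474976710655 values above min_val
    return min_val, max_val


def alter_sequence_sql(site_code):
    """Format the ALTER SEQUENCE statement described in the step above."""
    lo, hi = sums_sequence_bounds(site_code)
    return (f"ALTER SEQUENCE sum_ds_index_seq START {lo} RESTART {lo} "
            f"MINVALUE {lo} MAXVALUE {hi};")


# Placeholder site code; substitute your own DRMS_SITE_CODE.
print(alter_sequence_sql(1))
```

Because each site gets its own 48-bit range, storage-unit IDs generated at different sites cannot collide.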

    6. SUMS data files are organized into "partitions" which are implemented as directories. Each partition must be named /SUM[0-9]+ (e.g., /SUM0, /SUM1, ..., /SUM58, ..., /SUM99, /SUM100, /SUM101, ...). Each directory must be owned by the production linux user (e.g., "production"). The linux group to which the directories belong must be the SUMS-user group (set up in the Initializing the Linux Environment section, e.g. sums-users). All SUMS users must be members of this group. For example, if linux user art will be using DRMS and running DRMS modules that access SUMS files, then art must be a member of the SUMS-user group (e.g., sums-users). You are free to create as few or as many of these partitions as you desire. Create these directories now.

      NOTE: Please avoid using file systems that limit the number of directories and/or files. For example, the EXT3 file system limits the number of subdirectories in a single directory to roughly 32,000. That number is far too small for SUMS usage.
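      The naming rule for partitions can be checked mechanically before the directories are registered with SUMS. This is a minimal sketch using a regular expression taken directly from the /SUM[0-9]+ pattern above; the function name is a hypothetical helper, not part of NetDRMS.

```python
import re

# The whole path must be "/SUM" followed by one or more digits.
_PARTITION_RE = re.compile(r"/SUM[0-9]+")


def is_valid_partition_name(path):
    """Return True if path follows the /SUM[0-9]+ naming rule."""
    return _PARTITION_RE.fullmatch(path) is not None


for candidate in ("/SUM0", "/SUM101", "/SUM", "/data/SUM1"):
    print(candidate, is_valid_partition_name(candidate))
```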

    7. Initialize the sum_partn_avail table with the names of these partitions. For each SUMS partition run the following:

      % psql -h <db server host> -p 5434 -U postgres data_sums
      data_sums=# INSERT INTO sum_partn_avail (partn_name, total_bytes, avail_bytes, pds_set_num, pds_set_prime) VALUES ('<SUMS partition path>', <avail bytes>, <avail bytes>, 0, 0);

      where <SUMS partition path> is the full path of the partition (the path must be enclosed in single quotes) and <avail bytes> is some number less than the number of bytes in the directory (multiply the number of blocks in the directory by the number of bytes per block). The number does not matter, as long as it is not bigger than the total number of bytes available. SUMS will adjust this number as needed.
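      One way to obtain a suitable <avail bytes> value is to ask the file system directly. The sketch below uses os.statvfs to read the block counts and then formats the INSERT statement shown above. The helper names are hypothetical; following the statement above, the same value is used for both total_bytes and avail_bytes.

```python
import os


def filesystem_bytes(path):
    """Return (total_bytes, avail_bytes) for the file system containing path."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize   # number of blocks times bytes per block
    avail = st.f_bavail * st.f_frsize   # blocks available to unprivileged users
    return total, avail


def partition_insert_sql(partition_path, avail_bytes):
    """Format the sum_partn_avail INSERT for one SUMS partition."""
    quoted = partition_path.replace("'", "''")  # escape quotes for a SQL literal
    n = int(avail_bytes)
    return ("INSERT INTO sum_partn_avail "
            "(partn_name, total_bytes, avail_bytes, pds_set_num, pds_set_prime) "
            f"VALUES ('{quoted}', {n}, {n}, 0, 0);")


# Example for a hypothetical partition /SUM0 that already exists:
# total, avail = filesystem_bytes("/SUM0")
# print(partition_insert_sql("/SUM0", avail))
```

      The printed statement can then be pasted into the psql session shown above.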

Test your Postgres database installations

  1. Make sure that you, as production, and at least one other user can log in to both the SUMS and DRMS database instances with psql without a password prompt, using your .pgpass file.

  2. Do a \dt in both databases and check that you can see tables listed.
  3. Run SELECT * FROM sum_partn_avail; in the SUMS database and make sure that your SUMS partitions are accurately entered.
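
These checks can also be scripted. As a minimal sketch, the following Python code runs psql with the -w option, which makes psql fail instead of prompting when no password is available, so a zero exit status indicates that .pgpass and pg_hba.conf are working for that user. The function names and the host name in the example are hypothetical.

```python
import subprocess


def psql_check_command(host, port, database, user, sql="\\dt"):
    """Build a psql command line that never prompts for a password (-w)."""
    return ["psql", "-w", "-h", host, "-p", str(port), "-U", user,
            "-c", sql, database]


def can_connect_without_prompt(host, port, database, user):
    """Return True if psql connects and lists tables without asking for a password."""
    result = subprocess.run(psql_check_command(host, port, database, user),
                            capture_output=True, text=True)
    return result.returncode == 0


# Example with a hypothetical host name, one call per cluster:
# can_connect_without_prompt("dbhost.example.org", 5432, "data", "production")
# can_connect_without_prompt("dbhost.example.org", 5434, "data_sums", "production")
```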

Third Party Software for NetDRMS

You will need the following third party packages and main package libraries installed before compiling or the compilation will not work. Please note that these are examples from some successful installations, but your own machine may already be configured correctly or it may need an entirely different bunch of stuff installed to get to the same place. It's possible that even with the following installed, during your make you may see that you need further packages or libraries.

-- Development and standard package for postgres. Choose your version - this example shows packages for 9.3:
postgresql93.x86_64
postgresql93-devel.x86_64
postgresql93-libs.x86_64
postgresql93-plperl.x86_64
postgresql93-plpython.x86_64
postgresql93-pltcl.x86_64
postgresql93-server.x86_64

--Perl for scripts: V. 5.10 minimum; you may want development libraries installed. (Note that your OS may be relying on an old version of Perl and installing a new one directly on top of it may cause you strange and unexpected problems; parallel installation may be necessary.)

--Python, version 2.7 or higher (Note that some CentOS versions expect a lower version of Python for their own purposes, and installing directly on top of the existing Python may cause unexpected problems):
python33-python.x86_64
python33-python-devel.x86_64
python33-python-libs.x86_64

--Cfitsio development and standard packages:
cfitsio.x86_64
cfitsio-devel.x86_64

--OpenSSL, LibSSH2

--A compiler, choose either icc or gcc - you don't have to install these specific packages, these are only guides:
gcc.x86_64
libgcc.x86_64

--Development package and headers for C (gcc examples given here):
glibc-devel.x86_64
glibc-headers.x86_64

--Some compression stuff:
zlib.x86_64
zlib-devel.x86_64

--If you're going to be communicating regularly with the JSOC for replicated data, you may also need:
openssh.x86_64
openssh-clients.x86_64
openssh-server.x86_64
openssl.x86_64
openssl-devel.x86_64

--To build hpn-ssh for regular file exchange with JSOC:
See instructions on http://www.psc.edu/index.php/hpn-ssh , which will first instruct you to get the OpenSSH source code from OpenSSH.org. You will also need to install the "patch" package if it's not on your machine already, to put your hpn-ssh code together.

--If you're installing the JMD, you'll need Java installed along with its development library and the tools in tar.x86_64

Installing NetDRMS and SUMS

  1. Obtain a NetDRMS tarball from http://jsoc.stanford.edu/netdrms/dist/. Extract the tarball into <DRMS root>, a directory that serves as the root for the entire NetDRMS code tree, including binaries. A typical choice for <DRMS root> is /opt/netdrms-X.X or /usr/local/netdrms-X.X. To allow for future updates, it is best to make a link from /opt/netdrms to /opt/netdrms-X.X:

$ sudo mkdir -p /opt/netdrms-9.3
$ cd /opt/netdrms-9.3
$ sudo curl -O 'http://jsoc.stanford.edu/netdrms/dist/netdrms_9.3.tar.gz'
$ sudo tar xvzf netdrms_9.3.tar.gz
$ cd /opt
$ sudo ln -s /opt/netdrms-9.3 netdrms
  2. Create the config.local file, using <DRMS root>/config.local.template as a template. This file contains a number of configuration parameters, along with detailed descriptions of what they control and suggested values for those parameters. When installing NetDRMS updates, copy your existing, most recent config.local to the new <DRMS root> and edit the copy as needed, using the new config.local.template in <DRMS root> to obtain information about parameters new to the newer release. Please request from the JSOC a value for DRMS_SITE_CODE. This code uniquely identifies a NetDRMS installation; a site needs one value for each of its NetDRMS installations.

  3. Compile NetDRMS. The DRMS part of NetDRMS must be compiled with a C compiler. NetDRMS supports both the GNU C compiler (gcc), and the Intel C++ compiler (icc). Certain JSOC-specific code requires Fortran compilation. For those projects, NetDRMS supports the GNU Fortran compiler (gfortran), and the Intel Fortran compiler (ifort). SUMS is implemented as a Python daemon, so no compilation step is needed. Both the GNU and Intel compilers are supported, so feel free to use either. By default, Intel compilers are used. There are two methods for changing the compilers:
    • you can set the following environment variables:
      • COMPILER - set to icc for the Intel compiler, and to gcc for the GNU compiler
      • FCOMPILER - set to ifort for the Intel compiler, and to gfortran for the GNU compiler

    • you can edit the make_basic.mk file in <DRMS root>; to select the compilers, edit the COMPILER and FCOMPILER make variables declared near the top of the file:

# use Intel compilers
COMPILER = icc
FCOMPILER = ifort
# use GNU compilers (uncomment these two lines instead of the two above)
# COMPILER = gcc
# FCOMPILER = gfortran

A NetDRMS configuration script, configure, is part of the release and resides in <DRMS root>. This csh script runs shell scripts and a Python script, and it creates several directories in <DRMS root>:

  • bin - a directory that contains links to all executables in the DRMS code tree
  • include - a directory that contains links to all the header files in the DRMS code tree
  • jsds - a directory that contains links to all JSOC Series Definition (JSD) files
  • lib - a directory that contains links to all libraries in the DRMS code tree
  • localization - this directory contains project-specific make files and various files that contain the processed parameter information from config.local
  • scripts - a directory that contains links to all script files in the DRMS code tree

To make NetDRMS, run:

$ cd <DRMS root>
$ sudo ./configure

configure will place the resulting compiled files into an architecture-dependent directory in <DRMS root>. There are two possible names for this directory: _linux_x86_64 and _linux_avx (for hosts that support Advanced Vector Extensions). To see which directory to expect, you can run the jsoc_machine.csh csh script:

$ build/jsoc_machine.csh
linux_avx
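
As a rough illustration of how the two directory names relate to the host, the following Python sketch picks a name from the CPU flags. It assumes that the presence of the avx flag in /proc/cpuinfo is the deciding factor; that is a guess about what jsoc_machine.csh checks, not a description of it, and both function names are hypothetical.

```python
from pathlib import Path


def expected_build_dir(cpu_flags):
    """Return the build directory name for a set of CPU flags.

    Assumption: a host counts as AVX-capable when 'avx' is among its flags.
    """
    return "_linux_avx" if "avx" in cpu_flags else "_linux_x86_64"


def read_cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Collect the CPU flags listed in a Linux cpuinfo-style file."""
    flags = set()
    for line in Path(cpuinfo_path).read_text().splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags
```

Calling expected_build_dir(read_cpu_flags()) on the build host gives the name to look for; the output of jsoc_machine.csh remains the authoritative answer.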

Running SUMS Services

To launch the SUMS daemon, sumsd.py, use the start-mt-sums.py script:

$ sudo python3 start-mt-sums.py 

Performing a Test Run

At this point, it is a good idea to test your installation. Although you have no DRMS/SUMS data at this point, running show_series is a good way to test various components, like authentication, database connection, etc. To test SUMS, however, you will need to have at least one DRMS data series that has SUMS data. You can obtain such a data series by using the subscription system.

Test DRMS by running the show_series command:

$ show_series

If you see no errors, then life is good.

After you have at least one data series, then you can do more thorough testing. For example, you can run:

$ show_info -j <DRMS data series>

To test SUMS, you can run:

$ show_info -P <DRMS record-set specification>
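
If you want to script these checks, a minimal Python sketch could look like the following. It only wraps the show_series and show_info invocations shown above and reports their exit status; the function names run_check and smoke_test are hypothetical, and the sketch assumes the NetDRMS binaries are on your PATH and signal failure with a non-zero exit status.

```python
import subprocess


def run_check(command):
    """Run a command; return (ok, output), where ok means exit status 0."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError as exc:
        # The binary is missing or not executable.
        return False, str(exc)
    return result.returncode == 0, result.stdout


def smoke_test(series=None, record_set=None):
    """Run the DRMS checks, and the SUMS check when a record set is given."""
    checks = [["show_series"]]
    if series:
        checks.append(["show_info", "-j", series])
    if record_set:
        checks.append(["show_info", "-P", record_set])
    return [(" ".join(cmd),) + run_check(cmd) for cmd in checks]
```

Each entry in the returned list holds the command line, a success flag, and the captured output, so a failed step can be inspected without re-running it.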

To update to a newer NetDRMS release, simply create a new directory to contain the build, copy the previous config.local into the new <DRMS root> and edit it if new parameters have been added to config.local, and follow the directions for compiling DRMS. Any previous-release daemons that were running will need to be shut down, and the daemons in the newer release started.

Starting, Stopping and Testing SUMS

  1. Start SUMS:

    % $JSOCROOT/base/sums/scripts/sum_start.NetDRMS

    The script does not return a prompt after echoing "sum_svc now available". Just hit RETURN.

  2. To stop SUMS for any reason, run this script:

    % $JSOCROOT/base/sums/scripts/sum_stop.NetDRMS

  3. If both of these commands work, and you find many sum_ processes in your list of active processes while SUMS is running, you've been successful.

Deciding what's next

You may wish to run a JMD or use Remote SUMS. The decision should be discussed with JSOC personnel. Once you've made this decision and installed the appropriate software (see below for Remote SUMS), you'll need to populate your DRMS database with data. For this, you'll need to be a recipient of Slony subscription data. We recommend contacting the JSOC directly to become a subscriber.

Remote SUMS

A local NetDRMS may contain data produced by other, non-local NetDRMSs. Via a variety of means, the local NetDRMS can obtain and ingest the database information for these data series produced non-locally. In order to use the associated data files (typically image files), the local NetDRMS must download the storage units (SUs) associated with these data series too. There are currently two methods to facilitate these SU downloads. The Java Mirroring Daemon (JMD) is a tool that can be installed and configured to download SUs automatically as the series data records are ingested into the local NetDRMS. It fetches these SUs before they are actually used. It can obtain the SUs from any other NetDRMS that has the SUs, not just the NetDRMS that originally produced them. Remote SUMS is a built-in tool that comes with NetDRMS. It downloads SUs as needed - i.e., if a module or program requests the path to the SU or attempts to read it, and it is not present in the local SUMS yet, Remote SUMS will download the SUs. While the SUs are being downloaded, the initiating module or program will poll waiting for the download to complete.

Several components compose Remote SUMS. On the client side (the local NetDRMS), a daemon, rsumsd.py, must be running. Some database tables must also exist, as well as some binaries used by the daemon. On the server side (every NetDRMS site that wishes to act as a source of SUs for the client), a CGI, rs.sh, must be available. This CGI returns file-server information (hostname, port, user, SU paths, etc.) for the SUs the server has available in response to requests that contain a list of SUNUMs. When the client encounters requests for remote SUs that are not contained in the local SUMS, it asks the daemon to download those SUs. The client code then polls, waiting for the request to be serviced. The daemon in turn sends requests to the rs.sh CGIs at all the relevant providing sites. The owning sites return the file-server information to the daemon, and the daemon then downloads the SUs the client has requested, via scp, and notifies the client module once the SUs are available for use. The client module then exits its polling code and continues, using the freshly downloaded SUs.
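
The client-side polling described above can be sketched in Python as a simple wait loop. This is an illustration only: the real client code lives in the DRMS C library, and the status strings ("pending", "complete", "error"), the function name, and its parameters are all hypothetical.

```python
import time


def wait_for_request(get_status, poll_interval=1.0, timeout=60.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the request finishes or the timeout expires.

    get_status is any callable returning the current (hypothetical) status
    string of a download request.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("complete", "error"):
            return status
        if clock() >= deadline:
            return "timeout"
        sleep(poll_interval)
```

In a real deployment, get_status would read the request record that the daemon updates; here it is left as a callable so the loop can be exercised without a database.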

To use Remote SUMS, the config.local configuration file must first be configured properly, and NetDRMS must be re-built. Here are the relevant config.local parameters:

  • JMD_IS_INSTALLED - This must be set to 0 for Remote SUMS use. Currently, either the JMD or the Remote SUMS features can be used, but not both at the same time.
  • RS_REQUEST_TABLE - This is the database table used by the local module and the rsumsd.py daemon running at the local site for communicating SU-download requests. Upon encountering a non-native SUNUM, DRMS will insert a new record into this table to initiate a request for the SUNUM from the owning NetDRMS. The Remote SUMS daemon will service the request and update this record with results.
  • RS_SU_TABLE - This is the database table used by the Remote SUMS daemon to track SUs downloaded from the providing sites.
  • RS_DBHOST - This is the local database-server host that contains the database that contains the requests and SU tables.
  • RS_DBNAME - This is the database on the host that contains the requests and SU tables.
  • RS_DBPORT - This is the port on which the local database-server host accepts connections.
  • RS_DBUSER - This is the database user account that the Remote SUMS daemon uses to manage the Remote SUMS requests.
  • RS_LOCKFILE - This is the path to a file that ensures that only one Remote SUMS daemon instance runs.
  • RS_LOGDIR - This is the directory into which the Remote SUMS daemon logs are written.
  • RS_REQTIMEOUT - This is the timeout, in minutes, for a new SU request to be accepted for processing by the daemon. If the daemon encounters a request older than this value, it will reject that request.
  • RS_DLTIMEOUT - This is the timeout, in minutes, for an SU to download. If the time the download takes exceeds this value, then all requests waiting for the SU to download will fail.
  • RS_MAXTHREADS - The maximum number of download threads that the Remote SUMS daemon is permitted to run simultaneously. One thread is one scp call.
  • RS_BINPATH - The NetDRMS-binary-path that contains the external programs needed by the Remote SUMS daemon (jsoc_fetch, vso_sum_alloc, vso_sum_put).
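
A quick sanity check of these parameters can catch mistakes before rebuilding. The Python sketch below works on a plain dictionary of parameter names and values, because the on-disk format of config.local is not described here; loading the file is left to you. The function name is hypothetical, and treating the port, timeouts, and thread count as positive integers is an assumption.

```python
REQUIRED = (
    "RS_REQUEST_TABLE", "RS_SU_TABLE", "RS_DBHOST", "RS_DBNAME",
    "RS_DBPORT", "RS_DBUSER", "RS_LOCKFILE", "RS_LOGDIR",
    "RS_REQTIMEOUT", "RS_DLTIMEOUT", "RS_MAXTHREADS", "RS_BINPATH",
)
# Assumption: these parameters are expected to be positive integers.
POSITIVE_INTS = ("RS_DBPORT", "RS_REQTIMEOUT", "RS_DLTIMEOUT", "RS_MAXTHREADS")


def check_remote_sums_params(params):
    """Return a list of problems found in a dict of config.local values."""
    problems = []
    if str(params.get("JMD_IS_INSTALLED", "")).strip() != "0":
        problems.append("JMD_IS_INSTALLED must be 0 for Remote SUMS")
    for name in REQUIRED:
        if not str(params.get(name, "")).strip():
            problems.append(name + " is not set")
    for name in POSITIVE_INTS:
        value = str(params.get(name, "")).strip()
        if value and not (value.isdigit() and int(value) > 0):
            problems.append(name + " must be a positive integer")
    return problems
```

An empty list means the basic checks passed; it does not prove that the database host or tables actually exist.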

After setting up config.local, you must build or re-build NetDRMS:

% cd $JSOCROOT
% ./configure
% make

It is important to ensure that three binaries needed by the Remote SUMS daemon have been built: jsoc_fetch, vso_sum_alloc, vso_sum_put.
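
One way to confirm this is a short Python check like the sketch below, which looks for the three binaries in the directory you configured as RS_BINPATH. The function name missing_binaries is hypothetical.

```python
import os

REQUIRED_BINARIES = ("jsoc_fetch", "vso_sum_alloc", "vso_sum_put")


def missing_binaries(bin_path):
    """Return the required binaries that are absent or not executable."""
    missing = []
    for name in REQUIRED_BINARIES:
        candidate = os.path.join(bin_path, name)
        if not (os.path.isfile(candidate) and os.access(candidate, os.X_OK)):
            missing.append(name)
    return missing
```

An empty result means all three programs were found and are executable by the current user.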

Ensure that Python >= 2.7 is installed. You will need to install some packages if they are not already installed: psycopg2, ...
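
A small Python sketch can report what is missing. The function name is hypothetical, and only psycopg2 is checked by default because the full package list is not given here; pass additional names as you learn them.

```python
import importlib
import sys


def missing_prerequisites(packages=("psycopg2",), minimum=(2, 7)):
    """Return a list of unmet Python prerequisites for the current interpreter."""
    problems = []
    if sys.version_info[:2] < minimum:
        problems.append("Python %d.%d or newer is required" % minimum)
    for name in packages:
        try:
            importlib.import_module(name)
        except ImportError:
            problems.append("missing package: " + name)
    return problems
```

Run it with the same interpreter that will run the Remote SUMS daemon, since packages are installed per interpreter.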

An output log named rslog_YYYYMMDD.txt will be written to the directory identified by the RS_LOGDIR config.local parameter, so make sure that directory exists.
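
The directory can be created ahead of time with a sketch like the following, which also shows how the log name is formed. It assumes the date in the file name is the local calendar date; the function name rs_log_path is hypothetical.

```python
import datetime
import os


def rs_log_path(log_dir, day=None):
    """Create log_dir if needed and return the expected log path for a day."""
    day = day or datetime.date.today()
    os.makedirs(log_dir, exist_ok=True)
    return os.path.join(log_dir, "rslog_" + day.strftime("%Y%m%d") + ".txt")
```

Pass the value of RS_LOGDIR as log_dir; the function only prepares the directory and does not write the log itself.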

Give your public SSH key to every providing NetDRMS site. Each site will need to put that key in its authorized_keys file.

Create the client-side Remote SUMS database tables. Run:

% $JSOCROOT/base/drms/scripts/rscreatetabs.py op=create tabs=req,su

Start the rsumsd.py daemon as the user specified by the RS_DBUSER config.local parameter. As this user, start an ssh-agent process and add the private key to it:

% ssh-agent -c > $HOME/.ssh-agent_rs
% source $HOME/.ssh-agent_rs
% ssh-add $HOME/.ssh/id_rsa

This will allow you to create a public-private key pair that has a passphrase while obviating the need to manually enter that passphrase when the Remote SUMS daemon runs scp.

Start SUMS:

% $JSOCROOT/base/sums/scripts/sum_start.NetDRMS >& <log dir>/sumsStart.log

Substitute your favorite log directory for <log dir>. There is another daemon, sums_procck.py, that keeps SUMS up and running once it is started. Redirecting to a log will preserve important information that this daemon prints. To stop SUMS, use $JSOCROOT/base/sums/scripts/sum_stop.NetDRMS.

Start the Remote SUMS daemon:

% $JSOCROOT/base/drms/scripts/rsumsd.py

Subscribing to Series

  • To learn about how your institution, using its NetDRMS installation, can maintain a mirror of DRMS data that receives real-time updates, click here.

JsocWiki: DRMSSetup (last edited 2024-01-19 09:08:03 by ArtAmezcua)