Differences between revisions 80 and 91 (spanning 11 versions)
Revision 80 as of 2015-07-15 10:18:54
Size: 44527
Editor: ArtAmezcua
Comment:
Revision 91 as of 2015-07-16 01:48:17
Size: 42721
Editor: ArtAmezcua
Comment:
Line 11: Line 11:
The initial installation of NetDRMS requires installing database software, adding one or more new users, allocating a fair bit of additional disk space for file storage, and installing, configuring and compiling the custom NetDRMS code.

The entire NetDRMS system involves, from base to top:
 a. A couple of instances of the Postgres database, with users, procedures, and data tables within them.
 a. NetDRMS software written mainly in C, with some embedded Postgres calls and some Python (v2.7 or higher). There are two pieces to this software: DRMS and SUMS. Each is compiled/made separately. It also requires several third-party libraries, such as cfitsio, math libraries, and MPI.
 a. If you want to receive replicated data from JSOC, you'll need to install some scripts and work with your ssh keys and a software package called hpn-ssh.
 a. If you want to be a distributor of data, you'll need to install a 'JMD' java/derby database system, third party libraries for tar and curl, and possibly Slony replication software.
 a. If you are a VSO installation, you'll need to run a web server and install further Perl code.

When installing NetDRMS, it is best to do it in a nested order, as listed above, and test each phase for success as you go. Don't move on to the next piece of the installation until reasonably assured that the software installed in the prior step works as planned.

First, you will need to create a few linux users and groups, giving them the needed permissions (see step 1 below). Second, you will need to install the PostgreSQL Relational Database Management System (PG) and create two databases (see step 2 below). Third, you will need to establish disk storage for SUMS (see "Setting up a SUMS" below). Fourth, you will need to install third-party libraries needed by DRMS and SUMS (see X below). Fifth, you will need to build and install SUMS (see X below).
Line 63: Line 74:
=== Compiling NetDRMS ===
Line 66: Line 77:
You will need the following third-party packages and main package libraries installed before compiling, or the compilation will not work.

-- Development and standard package for postgres. Choose your version - this example shows packages for 9.3:
<<BR>>postgresql93.x86_64
<<BR>>postgresql93-devel.x86_64
<<BR>>postgresql93-libs.x86_64
<<BR>>postgresql93-plperl.x86_64
<<BR>>postgresql93-plpython.x86_64
<<BR>>postgresql93-pltcl.x86_64
<<BR>>postgresql93-server.x86_64

--Perl for scripts: V. 5.10 minimum; you may want development libraries installed.

--Python, version 2.7 or higher (note that some CentOS versions expect a lower version of Python for their own purposes, and installing directly on top of the existing Python may cause unexpected problems):
<<BR>>python33-python.x86_64
<<BR>>python33-python-devel.x86_64
<<BR>>python33-python-libs.x86_64
    
--Cfitsio development and standard packages:
<<BR>>cfitsio.x86_64
<<BR>>cfitsio-devel.x86_64

--A compiler, choose either icc or gcc - you don't have to install these specific packages, these are only guides:
<<BR>>gcc.x86_64
<<BR>>libgcc.x86_64

--Development and headers for C (gcc examples given here):
<<BR>>glibc-devel.x86_64
<<BR>>glibc-headers.x86_64

--Compression libraries:
<<BR>>zlib.x86_64
<<BR>>zlib-devel.x86_64

--If you're going to be communicating regularly with the JSOC for replicated data, you may also need:
<<BR>>openssh.x86_64
<<BR>>openssh-clients.x86_64
<<BR>>openssh-server.x86_64
<<BR>>openssl.x86_64
<<BR>>openssl-devel.x86_64

--To build hpn-ssh:
<<BR>>See instructions on http://www.psc.edu/index.php/hpn-ssh , which will first instruct you to get the OpenSSH source code from OpenSSH.org.
You will also need to install the "patch" package if it's not on your machine already, to put your hpn-ssh code together.
   
--If you're installing the JMD, you'll need Java installed along with its development library and the tools in tar.x86_64
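
As a rough sketch of how the package lists above might be turned into a single install command on an RPM-based system, the snippet below only prints a `yum install` command rather than running it. It assumes the repositories that provide these packages are already configured on the host; the package names are a subset of the examples listed above, and the variable names are illustrative.

```shell
#!/bin/sh
# Dry run: build and print a yum command for a subset of the example packages above.
# Assumption: the repositories providing these packages (for example, one carrying
# the postgresql93 packages) are already configured on this host.
pkgs="postgresql93 postgresql93-devel postgresql93-libs postgresql93-server \
cfitsio cfitsio-devel zlib zlib-devel gcc glibc-devel glibc-headers"

install_cmd="yum install -y $pkgs"
echo "$install_cmd"   # review the command, then run it as root if it looks right
```

Printing the command first makes it easy to adjust the Postgres version suffix before anything is installed.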

=== Now, on to compiling NetDRMS and SUMS ===
Line 90: Line 150:
To make the SUMS server available, the SUMS manager (only) needs to run ''make sums'' in the DRMS root directory. This only needs to be done once for the system; individual users do not need to do it. At this point, if you are the SUMS manager, you are ready to proceed with the configuration, build and start of SUMS services. Proceed to the SUMS setup instructions. Otherwise you are ready to go. Please note that you will see many, many warning messages as NetDRMS and SUMS compile. Pages and pages of warnings will likely appear. Unless you have an error, you should be okay to proceed.
Line 94: Line 152:
=== Building and installing SUMS ===

To make the SUMS server available, follow the steps below, or have the SUMS manager (only) run ''make sums'' in the DRMS root directory. This only needs to be done once for the system; individual users do not need to do it. At this point, if you are the SUMS manager, you are ready to proceed with the configuration, build and start of SUMS services. Proceed to the SUMS setup instructions. Otherwise you are ready to go. Please note that you will see many warning messages as NetDRMS and SUMS compile; pages and pages of warnings will likely appear. Unless you see an error, you should be okay to proceed.
Line 97: Line 157:
 1. Copy the sum_chmown program to <path to sum_chmown> (chosen in step 1a. above), make root the owner, and give it setuid privileges:<<BR>><<BR>>{{{su - root}}}<<BR>>{{{cp $JSOCROOT/drms/_linux_x86_64/base/sums/apps/sum_chmown <path to sum_chmown>}}}<<BR>>{{{chown root:root <path to sum_chmown>}}}<<BR>>{{{chmod u+s <path to sum_chmown>}}}<<BR>><<BR>>  Note: some sites have replaced this program with one that does nothing when called. These sites have only one user that writes files to SUMS, however, and so need not be concerned about different users with different permissions writing files to SUMS.
Line 101: Line 161:


old stuff below

== Building Your Own DRMS and SUMS ==
Sites other than the JSOC can host DRMS data series. They can maintain local copies of the DRMS and SUMS data created at the JSOC, and they can create their own DRMS data, of which other sites can maintain local copies. To participate in this network of sites sharing data, a site (aka a node) must install a DRMS/SUMS system to become a NetDRMS site. Once a member of this network, a NetDRMS site can selectively share specific data series - it is not necessary to share all series.

There are three fundamental requirements for setting up and operating a DRMS system:

 * Reserved disk space to serve as the SUMS disk cache.
 * A database server running Postgres version 8.4.
 * A "current" copy of the JSOC software tree, available from Stanford.

== Setting up a SUMS ==
The SUMS disk area can be as simple as a directory, but it is probably better to assign at least one disk partition to the SUMS cache. Unless a tape library also exists, the SUMS partition(s) must be large enough to store all the data segments in the DRMS that are to be archived locally. For datasets for which other DRMS servers provide the permanent archive, the local SUMS will serve only as a local cache, so size is dictated by expected usage.

The directory or directories to be used for SUMS must be owned by a user named '''production''' (can be any uid) and belong to a group named '''SOI''' (can be any gid), and have a permissions mode of 2775 (''drwxrwsr-x''). The group '''SOI''' should include as members any users who will be writing data into the DRMS by running modules or otherwise.
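
As a minimal sketch of the permission setup described above, the snippet below creates a directory with mode 2775, which `ls` displays as ''drwxrwsr-x''. A scratch directory stands in for the real SUMS partition, and the directory name `SUM0` is only an illustration; changing the owner and group on the real system requires root.

```shell
#!/bin/sh
# Sketch: create a SUMS-style directory with mode 2775 (drwxrwsr-x).
# A scratch directory stands in for the real SUMS partition (assumption).
sums_dir="$(mktemp -d)/SUM0"
mkdir -p "$sums_dir"

# 2 = setgid bit, so new files inherit the directory's group; 775 = rwxrwxr-x
chmod 2775 "$sums_dir"
ls -ld "$sums_dir"

# On the real system, run as root (user and group names from the text above):
#   chown production:SOI /path/to/sums/partition
```

The setgid bit is what keeps files written by different members of the group under the same group ownership.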

== Setting up the Postgres Database server ==
You should have Postgres Version 8.1 or higher installed; JSOC database servers are currently (Oct 2006) running on the following systems:

 * a 64-bit dual-core xeon running Red Hat Enterprise Linux 4 with Postgres v. 8.1.2
 * a 32-bit dual-core pentium 4 running Scientific Linux (?; equinox) with Postgres v. 8.1.4

== Populating the Database ==
First, you must create the database tables required for SUMS. You can do so by running the following SQL statements in psql:

{{{
create table SUM_MAIN (
 ONLINE_LOC VARCHAR(80) NOT NULL,
 ONLINE_STATUS VARCHAR(5),
 ARCHIVE_STATUS VARCHAR(5),
 OFFSITE_ACK VARCHAR(5),
 HISTORY_COMMENT VARCHAR(80),
 OWNING_SERIES VARCHAR(80),
 STORAGE_GROUP integer,
 STORAGE_SET integer,
 BYTES bigint,
 DS_INDEX bigint,
 CREATE_SUMID bigint NOT NULL,
 CREAT_DATE timestamp(0),
 ACCESS_DATE timestamp(0),
 USERNAME VARCHAR(10),
 ARCH_TAPE VARCHAR(20),
 ARCH_TAPE_POS VARCHAR(15),
 ARCH_TAPE_FN integer,
 ARCH_TAPE_DATE timestamp(0),
 WARNINGS VARCHAR(260),
 STATUS integer,
 SAFE_TAPE VARCHAR(20),
 SAFE_TAPE_POS VARCHAR(15),
 SAFE_TAPE_FN integer,
 SAFE_TAPE_DATE timestamp(0),
 constraint pk_summain primary key (DS_INDEX)
);

create table SUM_OPEN (
    SUMID bigint not null,
    OPEN_DATE timestamp(0),
    constraint pk_sumopen primary key (SUMID)
);

create table SUM_PARTN_ALLOC (
    wd VARCHAR(80) not null,
    sumid bigint not null,
    status integer not null,
    bytes bigint,
    effective_date VARCHAR(20),
    archive_substatus integer,
    group_id integer,
    ds_index bigint not null,
    safe_id integer
);

create table SUM_PARTN_AVAIL (
       partn_name VARCHAR(80) not null,
       total_bytes bigint not null,
       avail_bytes bigint not null,
       pds_set_num integer not null,
       constraint pk_sumpartnavail primary key (partn_name)
);

create table SUM_TAPE (
        tapeid varchar(20) not null,
        nxtwrtfn integer not null,
        spare integer not null,
        group_id integer not null,
        avail_blocks bigint not null,
        closed integer not null,
        last_write timestamp(0),
        constraint pk_tape primary key (tapeid)
);

create sequence SUM_SEQ
  increment 1
  start 2
  no maxvalue
  no cycle
  cache 50;

create sequence SUM_DS_INDEX_SEQ
  increment 1
  start 1
  no maxvalue
  no cycle
  cache 10;

create table SUM_FILE (
        tapeid varchar(20) not null,
        filenum integer not null,
        gtarblock integer,
        md5cksum varchar(36) not null,
        constraint pk_file primary key (tapeid, filenum)
       );

create table SUM_GROUP (
        group_id integer not null,
        retain_days integer not null,
        effective_date VARCHAR(20),
        constraint pk_group primary key (group_id)
       );
}}}
(These are contained in the scripts '''create_tables.sql''', '''sum_file.sql''', and '''sum_group.sql''' in the JSOC software library '''base/sums/scripts/postgres'''.) For example, if you have created a database named ''mydb'' on a server named ''myserver'' (and have one of those scripts in your working directory), you could enter the command

{{{
  psql -h myserver mydb -f create_tables.sql
}}}
Or you could simply enter the commands by hand. (You should be the database administrator when you create these tables.)
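
If you prefer to run all three scripts in one pass, a loop along the following lines may help. This is a dry run that only prints the psql invocations; ''myserver'' and ''mydb'' are the example names used above, and the script list is taken from the sentence naming the three files.

```shell
#!/bin/sh
# Dry run: print one psql invocation per script named above.
# myserver and mydb are the example names from the text; nothing is executed here.
scripts="create_tables.sql sum_file.sql sum_group.sql"

for f in $scripts; do
    echo "psql -h myserver mydb -f $f"
done
```

Remove the `echo` (and the surrounding quotes) once the printed commands match your server and database names.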

NetDRMS - a shared data management system

Introduction

In order to process, archive, and distribute the substantial quantity of data flowing from the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI) instruments on the Solar Dynamics Observatory (SDO), the Joint Science Operations Center (JSOC) has developed its own data management system. This system, the Data Record Management System (DRMS), consists of data series, each of which is a collection of related data. For example, there exists a data series named hmi.M_45s, which contains the HMI 45-second cadence magnetograms. Each data series consists of several DRMS objects: records, keywords, segments, and links. A DRMS record is the smallest unit of data-series data. Typically, it represents data for a single observation in time (hence the term series in data series), but there is no restriction on how a user organizes their data. A data series may contain one or more DRMS keywords, each of which represents a named bit of metadata. For example, many data series contain a DRMS keyword named CRPIX1. A DRMS segment is a collection of data that contains storage/retrieval information needed by DRMS to locate auxiliary data files. These data files contain large sets of data like image arrays. Generally, they are image files, but what they contain is arbitrary and user-defined. A data series optionally contains one or more DRMS links, each of which is a collection of data that links the data series to other DRMS data series. Each DRMS record contains record-specific values for the DRMS keywords, segments, and links. In this way, one record may have one set of keyword, segment, and link values, and another record may have a different set of these values.

The Storage Unit Management System (SUMS) is the file-management system that contains the data files that DRMS records refer to. Each DRMS segment value is used by DRMS code to derive the SUMS file-system path to a single data file. Because each DRMS series may contain multiple DRMS segments, each DRMS record may point to more than one data file.

To manage all these data, DRMS comprises several components, one of which is a database instance in a relational-database management system (PostgreSQL). The DRMS Library code uses a database instance and several tables to implement the DRMS objects. For each data series, there exists a database table that contains one row per DRMS record. The columns of each of these rows contain the DRMS keyword, segment, and link values - bits of data that are all small enough to fit efficiently in a database record. The data-file data are too large to fit into a database record, so those data reside in data files in SUMS. The DRMS-segment values point to the data files, using a unique identifier called a SUNUM. SUMS itself comprises several components, one of which is another database instance that contains several database tables. When DRMS needs a data file, it requests the file from SUMS by providing SUMS with a SUNUM, and then SUMS consults its database tables to derive the path to the data file. SUMS shuttles files between hard disk (aka the disk cache) and tape, so data files have no permanent file path. Therefore, when DRMS requests the path to a file, SUMS must obtain the current path by consulting a database table.

Installing NetDRMS

Installing NetDRMS for the First Time

The initial installation of NetDRMS requires installing database software, adding one or more new users, allocating a fair bit of additional disk space for file storage, and installing, configuring and compiling the custom NetDRMS code.

The entire NetDRMS system involves, from base to top:

  1. A couple of instances of the Postgres database, with users, procedures, and data tables within them.
  2. NetDRMS software written mainly in C, with some embedded Postgres calls and some Python (v2.7 or higher). There are two pieces to this software: DRMS and SUMS. Each is compiled/made separately. It also requires several third-party libraries, such as cfitsio, math libraries, and MPI.
  3. If you want to receive replicated data from JSOC, you'll need to install some scripts and work with your ssh keys and a software package called hpn-ssh.
  4. If you want to be a distributor of data, you'll need to install a 'JMD' java/derby database system, third party libraries for tar and curl, and possibly Slony replication software.
  5. If you are a VSO installation, you'll need to run a web server and install further Perl code.

When installing NetDRMS, it is best to do it in a nested order, as listed above, and test each phase for success as you go. Don't move on to the next piece of the installation until reasonably assured that the software installed in the prior step works as planned.

First, you will need to create a few linux users and groups, giving them the needed permissions (see step 1 below). Second, you will need to install the PostgreSQL Relational Database Management System (PG) and create two databases (see step 2 below). Third, you will need to establish disk storage for SUMS (see "Setting up a SUMS" below). Fourth, you will need to install third-party libraries needed by DRMS and SUMS (see X below). Fifth, you will need to build and install SUMS (see X below).

To install NetDRMS and SUMS, please follow these directions in order. All accounts/paths/ports/etc. referenced can be modified, but we recommend not doing this unless you are certain they must be different. Debugging issues from Stanford becomes difficult if every site does things differently. The accounts/paths/ports/etc. listed below are the ones used on Stanford's test NetDRMS (on the machine "shoom").

  1. Download the NetDRMS Distribution. This is a gzipped tarfile. Unpack it into a target root directory of your choice, e.g. /usr/local/drms or $HOME/drms or /opt/netdrms. The size of the source distribution is currently about 10 MB. A built system (including SUMS) is typically about 300 MB. In the target root directory (hereinafter referred to as $DRMS), you must supply a config.local file describing your site configuration.

You may wish to create a symbolic link to the NetDRMS directory. For example, your code is really in /opt/netdrms87/, but you have a link at /opt/netdrms/ that points to whatever your most current NetDRMS code directory is. This will facilitate updates without changing environment variables. Once you've decided where to put the code, untar it and have a look at it. In particular, read the config.local.template file. You will later need to copy this file to config.local and adjust it for your own site. For now, just read config.local.template, since these installation instructions reference its variables often.

When you do create your config.local file, it is a good idea to save a copy in a directory outside your $DRMS directory; the SUMS_LOG_BASEDIR would be a good place to keep it if you are the SUMS_MANAGER.

Bear in mind that you may have to change the ownership and permissions on the $DRMS directory as you go through the install process and determine the user that will run the code.
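
The link arrangement described above can be sketched as follows. A scratch directory stands in for /opt here, and netdrms87 is the example directory name from the text; adapt both to your own layout.

```shell
#!/bin/sh
# Sketch: keep versioned source trees side by side and point a stable link at the current one.
# A scratch directory stands in for /opt (assumption); netdrms87 is the example name from the text.
base="$(mktemp -d)"
mkdir "$base/netdrms87"

# -s symbolic, -f replace an existing link, -n do not descend into the old link target
ln -sfn "$base/netdrms87" "$base/netdrms"

readlink "$base/netdrms"   # shows which tree the link currently points at
```

When a new release is unpacked alongside the old one, re-running the `ln -sfn` line with the new directory switches the link without touching any environment variables.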

  1. Set up your existing linux environment to accept NetDRMS (to be done by a superuser or someone with sudo privileges)
    1. Create a production linux user (named production by default). The name of this user is the value of the SUMS_MANAGER parameter in the config.local file. If necessary, modify the sudoers file to include the name of the production user so that this user has the privileges necessary to run a setuid program, sum_chmown, that is part of the SUMS-installation package:

      <production user> <host>=NOPASSWD:<path to sum_chmown>

      This will allow sum_chmown to be run without a password prompt being presented. Other sites have configured their production user to have highly specific ownership permissions as an alternative to giving the user sudo privileges, and nullified the sum_chmown script since all their data is written only by one user.

    2. Create a linux group to which the production user belongs, e.g. sumsadmin. All users who will be using the NetDRMS system to access or create SUMS data files must also belong to this group.

    3. Ensure that the production user can connect to the database without being prompted for a password. To do this, create a .pgpass file in the production user's home directory. See the PostgreSQL documentation for information on the .pgpass file. It is important that the permissions for the .pgpass file are set to 600, readable and writable only by the individual user. You may wish to wait on this step until you install Postgres - you will need to adjust your pg_hba.conf settings in Postgres in order for the .pgpass file to work correctly.

    4. Create a linux user named "postgres". This is the user that will own all of the Postgres data files. It is also the user that will run the server daemon process (postgres).
    5. Each user of DRMS, including the production user, must set two environment variables in their environment:

      setenv JSOCROOT <DRMS source tree root>
      setenv JSOC_MACHINE <OS and CPU>

      where <DRMS source tree root> is the root of the DRMS source tree installed by the production linux user, and <OS and CPU> is "linux_x86_64", if DRMS was installed on a machine with a Linux OS and a 64-bit processor, or "linux_avx", if DRMS was installed on a machine with a Linux OS and a 64-bit processor that supports Advanced Vector Extensions (which supports an extended instruction set). Again, you may wish to have the NetDRMS software installed and compiled before you put the $JSOC_MACHINE variable into play.

    6. Create the SUMS log directory on the SUMS server machine, if it does not already exist. The name/path for this directory is defined in config.local in the SUMS_LOG_BASEDIR parameter. The actual directory must match the value of this parameter, which defaults to /usr/local/logs/SUM. You are free to change this path in SUMS_LOG_BASEDIR. This directory must be writeable by the linux production user.
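
The `setenv` lines in the sub-steps above are csh syntax. For users of Bourne-style shells, a rough equivalent is sketched below. The path /opt/netdrms is a placeholder for your own source tree root, and the /proc/cpuinfo check for AVX support is a Linux-specific heuristic that is not part of NetDRMS.

```shell
#!/bin/sh
# Sketch: Bourne-shell equivalent of the setenv lines above.
# /opt/netdrms is a placeholder for your DRMS source tree root (assumption).
JSOCROOT=/opt/netdrms
export JSOCROOT

# Pick linux_avx when the CPU advertises the avx flag, otherwise linux_x86_64.
# Reading /proc/cpuinfo is a Linux-specific heuristic, not part of NetDRMS.
if grep -qw avx /proc/cpuinfo 2>/dev/null; then
    JSOC_MACHINE=linux_avx
else
    JSOC_MACHINE=linux_x86_64
fi
export JSOC_MACHINE

echo "JSOCROOT=$JSOCROOT JSOC_MACHINE=$JSOC_MACHINE"
```

Lines like these would typically go in each user's shell start-up file so that every DRMS session sees the same values.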

  2. Set up the Postgres database.
    1. Install server version 8.4 (this is the only version supported by Stanford) on a dedicated machine. Obtain the latest 8.4 rpm binaries from ftp://ftp.postgresql.org/pub/binary/. You can install a later version of Postgres (versions up to 9.3 have been proven at other data sites) if you are not going to become a provider or generator of Slony data. Slony is the database replication software that is used by the JSOC at Stanford to distribute records, and its version is presently tied to Postgres 8.4.x.

    2. Install the client software, version 8.4 or your chosen server version, on all machines that will be used to either access the database server or build DRMS software. All DRMS software must connect to the DRMS and SUMS databases. To do so, it must be linked against static and/or dynamic libraries that allow database access. These libraries are a component of the Postgres client software, so it must be installed on machines used to build DRMS software. Some dynamic libraries are involved, so the host on which this software is run must also have the Postgres client software installed.
    3. Create a database cluster for the DRMS data. A database cluster is a storage area on disk that contains the data for one or more databases. The storage area is implemented as a directory (the data directory) and it is managed by a single instance of a Postgres server process. To create this cluster (data directory), first log-in as the linux user postgres, and then run the initdb command:

      initdb --locale=C -D /var/lib/pgsql/data

      This will create the data directory /var/lib/pgsql/data on the database server host. If you want to place the data in a different directory, go right ahead and change the -D parameter value. The "--locale" argument will set cluster locale to "C". Locale support refers to an application respecting cultural preferences regarding alphabets, sorting, number formatting, etc. PostgreSQL uses the standard ISO C and POSIX locale facilities provided by the server operating system. We recommend "C" and make no guarantees what will happen to your formatting if you deviate.

    4. Create a database cluster for the SUMS data. This cluster is distinct from the cluster for the DRMS data, and it is maintained by a separate server instance:

      initdb --locale=C -D /var/lib/pgsql/data_sums

      This will create the data directory /var/lib/pgsql/data_sums on the database server host (or wherever you've decided to put the cluster with the -D parameter).

    5. Edit the Postgres configuration files. There is one set for each cluster you created with initdb; the configuration files are cluster-specific, and they reside in the data directory created by the initdb command. Their parameters determine your database efficiency and security. A complete list of all modifiable parameters can be found in the Postgres online documentation, but a few are worth mentioning now.
      1. listen_addresses (in postgresql.conf) is a list of the server's own IP addresses (network interfaces) on which it listens for connections. By default the value of the parameter is "localhost", which accepts connections only from the machine hosting the database server process. This is not what you want. The single-quoted string '*' makes the server listen on all of its interfaces. If you want to be more restrictive, provide a comma-separated list of the server's hostnames or IP addresses; which client machines may connect is controlled separately in pg_hba.conf.
      2. port (in postgresql.conf) is the port on which the server listens for connections. If you create more than one cluster on the host server machine (e.g., if you create both the DRMS and SUMS clusters on a single host), then you'll need to change the port number for at least one cluster (you cannot have two server processes listening for connections on the same port). We suggest using port 5432 for the DRMS cluster (port = 5432 - no quotes), and port 5434 for the SUMS cluster. Note that port 5432 is the default port for Postgres.
      3. logging_collector (in postgresql.conf). Set this to 'on' so that the output of the Postgres server process will be captured into log files and rotated once per day.
      4. log_rotation_size (in postgresql.conf). Set this to 0. This will cause PG to emit one log every day (as opposed to starting a new log after the previous log is a certain size).
      5. log_min_duration_statement (in postgresql.conf). Set this to 1000 so that only queries that are greater than 1000 ms in run time will be logged. Otherwise, the log files will quickly get out of hand.
      6. shared_buffers (in postgresql.conf). Set this to more than the 128 MB default. This is the amount of memory the database server uses for shared data buffers, and most machines have more memory to devote than the default. You may also wish to adjust the values for work_mem, maintenance_work_mem, and max_stack_depth, but consult the Postgres manual for a better understanding.
      7. The pg_hba.conf file. This file contains lines of the form

        <connection type>  <databases>  <user>  <IP address>  <IP mask>  <authentication method>

        if you wish to use an IP-address mask to specify a range of IP addresses, or

        <connection type>  <databases>  <user>  <CIDR-address>  <authentication method>

        if you wish to use a CIDR-address to specify the range. To get yourself up and running, you'll need to add a line or two to this file. To allow access by one host, we suggest

        host  all  all XXX.XXX.XXX.XXX  255.255.255.255  md5

        or

        host  all  all XXX.XXX.XXX.XXX/32  md5

        For multiple-host access, we suggest

        host  all  all XXX.XXX.XXX.0  255.255.255.0  md5

        or

        host  all  all  XXX.XXX.XXX.0/24  md5

        The md5 authentication method requires a password, which is what triggers the use of user .pgpass files. You may also wish to comment out the line "local local trust" - this line allows anyone on the local machine to log in with no password, and isn't secure. Once you've commented out the "local local trust" line, you will no longer be able to log in without a correctly made .pgpass file. Please note that whenever you make changes to pg_hba.conf, you will need to reload or restart the database server for the changes to take effect.
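
Pulling the settings from this step together, the sketch below writes them to scratch files so they can be reviewed before being merged into the real configuration. The values are the ones suggested above for the SUMS cluster; the file names and the 192.0.2.0/24 network (a documentation-only address range) are placeholders, and shared_buffers is left out because its value depends on your machine.

```shell
#!/bin/sh
# Sketch: the settings discussed above, written to scratch files for review.
# On a real system these lines belong in postgresql.conf and pg_hba.conf inside
# the cluster's data directory (for example /var/lib/pgsql/data_sums).
conf_dir="$(mktemp -d)"

cat > "$conf_dir/postgresql.conf.sums" <<'EOF'
listen_addresses = '*'              # listen on all of the server's interfaces
port = 5434                         # SUMS cluster; the DRMS cluster keeps 5432
logging_collector = on              # capture server output into log files
log_rotation_size = 0               # rotate by time only, not by size
log_min_duration_statement = 1000   # log only statements slower than 1000 ms
EOF

cat > "$conf_dir/pg_hba.conf.sums" <<'EOF'
# type  database  user  address        method
host    all       all   192.0.2.0/24   md5
EOF

cat "$conf_dir/postgresql.conf.sums" "$conf_dir/pg_hba.conf.sums"
```

The DRMS cluster would use the same settings with `port = 5432`.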

  3. The remainder of the instructions require that the Postgres servers (there is one for the DRMS cluster, and one for the SUMS cluster) be running. To start up the server instances, run:

    su postgres
    pg_ctl start -D /var/lib/pgsql/data # start the DRMS-database cluster server
    pg_ctl start -D /var/lib/pgsql/data_sums -o "-p 5434" # start the SUMS-database cluster server.

    The server logs will be placed in the pg_log subdirectory for each cluster.

  4. Create the DRMS database in the DRMS cluster, and create the SUMS database in the SUMS cluster:

    su postgres
    createdb --locale C -E LATIN1 -T template0 data # create the DRMS database in the DRMS-database cluster
    createdb --locale C -E LATIN1 -T template0 -p 5434 data_sums # create the SUMS database in the SUMS-database cluster

    NOTE: The -E flag sets the character encoding of the characters stored in the database. LATIN1 is not a great choice (it would have been better to have used SQL_ASCII or UTF8), but that is what was chosen at Stanford so we're stuck with it, which means remote sites that have become series subscribers are stuck with it too.

  5. Install the required DB-server languages:

    createlang -h <db server host> -p 5432 -U postgres plpgsql data # Add the plpgsql language to the DRMS database
    createlang -h <db server host> -p 5432 -U postgres plperl data # Add the plperl language to the DRMS database
    createlang -h <db server host> -p 5432 -U postgres plperlu data # Add the plperlu 'untrusted' language to the DRMS database

    At this time, there are no auxiliary languages needed for the SUMS database.

  6. Create various tables and DRMS database functions needed by the DRMS library. You will need the NetDRMS source code for this:

    psql -h <db server host> -p 5432 -U postgres data -f $JSOCROOT/base/drms/scripts/NetDRMS.sql # Create the 'admin' schema and tables within this schema; create the 'drms' schema
    # Create the SUMSADMIN database user
    su postgres
    cd $JSOCROOT/base/drms/scripts
    ./createpgfuncs.pl data # Create functions in the DRMS database

  7. Create database accounts for DRMS users. To use DRMS software/modules, a user of this software must have an account on the DRMS database (a DRMS series is implemented as several database objects). The software, when run, will log into a user account on the DRMS database - by default, the name of the user account is the name of the linux user account that the DRMS software runs under.
    1. Run the newdrmsuser.pl script - you will be prompted for the postgres dbuser password:

      $JSOCROOT/base/drms/scripts/newdrmsuser.pl data <db server host> 5432 <db user> <initial password> <db user namespace> user 1

      where <db user> is the name of the user whose account is to be created and <db user namespace> is the namespace DRMS should use when running as the db user and reading or writing database tables. The namespace is a logical container of database objects, like database tables, sequences, functions, etc. The names of all objects are qualified by the namespace. For example, to unambiguously refer to the table "mytable", you prepend the name with the namespace. So, for example, if this table is in the su_production namespace (container), then you refer to the table as "su_production.mytable". In this way, there can be other tables with the same name, but that reside in a different namespace (e.g., su_arta.mytable is a different table that just happens to have the same name). Please see the NOTE in this page for assistance with choosing a namespace. <initial password> is the initial password for this account.

    2. Have the user that owns the account change the password:

      psql -h <db server host> -p 5432 data
      data=> ALTER USER <db user> WITH PASSWORD '<new password>';

      where <new password> is the replacement for the original password. It must be enclosed in single quotes.

    3. Have the user put their password in their .pgpass file. See the PostgreSQL documentation for details on the .pgpass file. This file allows the user to log in to their database account without having to provide a password at a prompt. At this point, it would be wise to test that your own logins work with your .pgpass file. A misconfiguration in your pg_hba.conf file can make it appear that .pgpass is not working.
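
As a rough sketch of what a .pgpass entry looks like, the Python snippet below formats one line in the hostname:port:database:username:password layout that PostgreSQL expects and restricts the file to its owner (mode 0600; the client library ignores the file if group or others can access it). The helper names, the host dbhost.example.org, the user art, and the password are illustrative placeholders, not part of NetDRMS.

```python
import os
import stat


def pgpass_line(host, port, database, user, password):
    """Format one .pgpass entry: hostname:port:database:username:password."""
    def esc(field):
        # Backslashes and colons inside a field must be escaped with a backslash.
        return str(field).replace("\\", "\\\\").replace(":", "\\:")
    return ":".join(esc(f) for f in (host, port, database, user, password))


def append_pgpass(path, line):
    """Append an entry and restrict the file to its owner (mode 0600)."""
    with open(path, "a") as fh:
        fh.write(line + "\n")
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


if __name__ == "__main__":
    # Hypothetical host, user, and password; port 5432 and database 'data'
    # follow the DRMS database steps above.
    print(pgpass_line("dbhost.example.org", 5432, "data", "art", "s3cret"))
```

A user would typically call append_pgpass(os.path.expanduser("~/.pgpass"), ...) once per database account, for example once for the DRMS database on port 5432 and once for the SUMS database on port 5434.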

    4. Create a db account for the linux production user (the name is the value of the SUMS_MANAGER parameter in config.local). The name of the database user for this linux user is the same as the name of the linux user (typically 'production'). Follow the previous steps to create this database account.
    5. Create a password for the sumsadmin DRMS database user, following the "ALTER USER" directions above. The user was created by the newdrmsuser.pl script above.
    6. OPTIONALLY, create a table to be used for DRMS version control:
      psql -h <db server host> -p 5432 -U <postgres administrator> data
      CREATE TABLE drms.minvers(minversion text default '1.0' not null);
      GRANT SELECT ON drms.minvers TO public;
      INSERT INTO drms.minvers(minversion) VALUES('<version>');
      where <version> is the minimum DRMS version that a DRMS module must have before it can connect to the DRMS database. Because the column has type text, the value must be enclosed in single quotes.

  8. Set up the SUMS database. Although the SUMS data cluster and SUMS database have already been created, you must still create certain tables and users in this database.
    1. Create the production user in the SUMS database:

      $JSOCROOT/base/drms/scripts/newdrmsuser.pl data_sums <db server host> 5434 <db production user> <password> <db production user namespace> sys 1

      where <db production user namespace> is the namespace. Please see the NOTE in this link for assistance with choosing a namespace for the production user.

    2. Put the production db user into the sumsadmin group:

      psql -h <db server host> -p 5432 data -U postgres
      postgres=> GRANT sumsadmin TO <db production user>;

    3. Put the production user's password into the .pgpass file. See the PostgreSQL documentation for details on the .pgpass file.

    4. Create the SUMS database tables:

      psql -h <db server host> -p 5434 -U production -f scripts/create_sums_tables.sql data_sums
      ALTER SEQUENCE sum_ds_index_seq START <min val> RESTART <min val> MINVALUE <min val> MAXVALUE <max val>;

      where <min val> is <drms site code> << 48 (the site code shifted left by 48 bits), <max val> is <min val> + 281474976710655 (that is, <min val> + 2^48 - 1), and <drms site code> is the value of the DRMS_SITE_CODE parameter in config.local.
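
The arithmetic for the sequence bounds can be sketched in Python as follows. The function names are made up for this illustration, and the site code 1 used in the example is a placeholder rather than a registered code. The range check reflects the fact that PostgreSQL sequence values are 64-bit signed integers.

```python
def sum_ds_index_bounds(site_code):
    """Return (min_val, max_val) for sum_ds_index_seq given DRMS_SITE_CODE."""
    # (site_code + 1) << 48 must not exceed 2^63, the signed 64-bit limit.
    if not 0 <= site_code < (1 << 15):
        raise ValueError("site code must be in the range 0..32767")
    min_val = site_code << 48            # site code sits above the low 48 bits
    max_val = min_val + (1 << 48) - 1    # min_val + 281474976710655
    return min_val, max_val


def alter_sequence_sql(site_code):
    """Build the ALTER SEQUENCE statement shown above for one site code."""
    lo, hi = sum_ds_index_bounds(site_code)
    return ("ALTER SEQUENCE sum_ds_index_seq START {lo} RESTART {lo} "
            "MINVALUE {lo} MAXVALUE {hi};").format(lo=lo, hi=hi)


if __name__ == "__main__":
    # Placeholder site code; use the DRMS_SITE_CODE value from config.local.
    print(alter_sequence_sql(1))
```

For the placeholder site code 1, the bounds are 281474976710656 and 562949953421311.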

    5. Grant elevated privileges on these tables to the db production user (the scripts should be modified to do this):

      psql -h <db server host> -p 5434 -U postgres data_sums
      data_sums=> GRANT ALL ON sum_tape TO production;
      data_sums=> GRANT ALL ON sum_ds_index_seq,sum_seq TO production;
      data_sums=> GRANT ALL ON sum_file,sum_group,sum_main,sum_open TO production;
      data_sums=> GRANT ALL ON sum_partn_alloc,sum_partn_avail TO production;

    6. SUMS data files are organized into "partitions", which are implemented as directories. Each partition must be named /SUM[0-9]* (e.g., /SUM, /SUM0, /SUM101). Each directory must be owned by the production linux user (e.g., "production"). The file-system group to which the directories belong, the SUMS user group (e.g., SOI), must also contain all DRMS users. So, if linux user art will be using DRMS and running DRMS modules, then art must be a member of the SUMS user group. You are free to create as few or as many of these partitions as you desire. Create these directories now.

      NOTE: Please avoid using file systems that limit the number of directories and/or files. For example, the EXT3 file system limits the number of directories to 64K. That number is far too small for SUMS usage.

    7. Initialize the sum_partn_avail table with the names of these partitions. For each SUMS partition run the following:

      psql -h <db server host> -p 5434 -U postgres data_sums
      data_sums=> INSERT INTO sum_partn_avail (partn_name, total_bytes, avail_bytes, pds_set_num, pds_set_prime) VALUES ('<SUMS partition path>', <avail bytes>, <avail bytes>, 0, 0);

      where <SUMS partition path> is the full path of the partition (the path must be enclosed in single quotes) and <avail bytes> is some number less than the number of bytes in the directory (multiply the number of blocks in the directory by the number of bytes per block). The number does not matter, as long as it is not bigger than the total number of bytes available. SUMS will adjust this number as needed.
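
One way to obtain a suitable <avail bytes> value is to ask the file system, as in the Python sketch below. It multiplies block counts by the block size, as described above, and builds the INSERT statement for one partition. The function names and the 95% safety margin in the example are assumptions made for illustration. The path check treats /SUM[0-9]* as a pattern that the whole path must match, which is one reading of the naming rule in step 6.

```python
import os
import re


def partition_bytes(path):
    """Return (total_bytes, free_bytes) for the file system holding path."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize    # number of blocks * bytes per block
    free = st.f_bavail * st.f_frsize     # blocks available to unprivileged users
    return total, free


def partn_insert_sql(path, avail_bytes):
    """Build the INSERT for one SUMS partition, mirroring the statement above."""
    if not re.match(r"^/SUM[0-9]*$", path):
        raise ValueError("partition must be named /SUM[0-9]*: " + path)
    return ("INSERT INTO sum_partn_avail (partn_name, total_bytes, avail_bytes, "
            "pds_set_num, pds_set_prime) VALUES ('{p}', {b}, {b}, 0, 0);"
            ).format(p=path, b=int(avail_bytes))


if __name__ == "__main__":
    # Inspect the root file system as a stand-in for a real partition.
    total, free = partition_bytes("/")
    # Hypothetical margin: advertise 95% of the free space under the name /SUM0.
    print(partn_insert_sql("/SUM0", free * 95 // 100))
```

Since SUMS adjusts the number itself, any value that does not exceed the space actually available should do.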

Compiling NetDRMS

The configuration and compilation of NetDRMS described here can proceed largely independently of the site and/or user setup, which only needs to be done once. It is recommended that the site setup be done first, as the NetDRMS build requires the definition of certain site-dependent names, such as those of the database and server; however, if these names are already known, the libraries can be built without the database and SUMS storage in place. Any code that requires access to the database will not of course function until the DRMS and SUMS services have been set up.

You will need the following third party packages and main package libraries installed before compiling or the compilation will not work.

-- Development and standard package for postgres. Choose your version - this example shows packages for 9.3:
postgresql93.x86_64
postgresql93-devel.x86_64
postgresql93-libs.x86_64
postgresql93-plperl.x86_64
postgresql93-plpython.x86_64
postgresql93-pltcl.x86_64
postgresql93-server.x86_64

--Perl for scripts: V. 5.10 minimum; you may want development libraries installed.

--Python, version 2.7 or higher (note that some CentOS versions expect a lower version of Python for their own purposes, and installing directly on top of the existing Python may cause unexpected problems):
python33-python.x86_64
python33-python-devel.x86_64
python33-python-libs.x86_64

--Cfitsio development and standard packages:
cfitsio.x86_64
cfitsio-devel.x86_64

--A compiler, choose either icc or gcc - you don't have to install these specific packages, these are only guides:
gcc.x86_64
libgcc.x86_64

--Development and headers for C (gcc examples given here):
glibc-devel.x86_64
glibc-headers.x86_64

--Some compression stuff:
zlib.x86_64
zlib-devel.x86_64

--If you're going to be communicating regularly with the JSOC for replicated data, you may also need:
openssh.x86_64
openssh-clients.x86_64
openssh-server.x86_64
openssl.x86_64
openssl-devel.x86_64

--To build hpn-ssh:
See instructions on http://www.psc.edu/index.php/hpn-ssh , which will first instruct you to get the OpenSSH source code from OpenSSH.org. You will also need to install the "patch" package if it's not on your machine already, to put your hpn-ssh code together.

--If you're installing the JMD, you'll need Java installed along with its development library and the tools in tar.x86_64

Now, on to compiling NetDRMS and SUMS

These instructions assume that there is already a NetDRMS database server and associated SUMS server that you can connect to. If that is not the case, then you or someone else at your site will first have to do a Site Installation (above). You must also have the PostgreSQL Core installed at least as a client library on any machine on which you intend to build the package. You should have psql in your path.

If you have not already done so, download the NetDRMS Distribution. This is a gzipped tarfile. Unpack it into a target root directory of your choice, e.g. /usr/local/drms, /opt/netdrms/ or $HOME/drms. In the target root directory (hereinafter referred to as $DRMS), you must supply a config.local file describing your site configuration. If V 2.7 or higher has been installed by your site administrator, you should simply copy or link to their version of the file. For site administrators:

If you have not previously installed a V 2.7 release or higher, you should create the config.local file fresh. You can do so either by copying the file config.local.template and editing the copy to supply the appropriate values, or by running the perl script netdrms_setup.pl, which will walk you through the fields. (That script has not been widely tested, and might require some tweaking. In particular, it tries to execute some additional scripts at the end that are not yet in the release.)

Most of the entries in the file should be self-explanatory. It is essential that the first variable, LOCAL_CONFIG_SET, be changed from NO or commented out. Other variables that are almost certain to require changes are DBSERVER_HOST, DRMS_DATABASE, SUMS_SERVER_HOST, and DRMS_SITE_CODE. If you intend to export as well as import data, your DRMS_SITE_CODE must be registered. See the site code page for a list of currently assigned codes.

However you create your config.local file, as previously stated, it is a good idea to save a copy in a directory outside your $DRMS directory; the SUMS_LOG_BASEDIR would be a good place to keep it if you are the SUMS_MANAGER. Other users' config.local files should match that of the SUMS_MANAGER in any case. In the target root directory $DRMS, run

  • ./configure

This simply builds a set of links for include files, man pages, scripts, and jsd (JSOC Series Descriptor) files in common subdirectories below the root. Note that it is a csh script. If you do not have csh or tcsh installed on your system, you will have to make those links yourself. (Chances are that you will have to perform the whole site configuration by hand.)

The NetDRMS distribution is currently supported for three target architectures under Linux, named (by default):

  • linux_ia32 (uname -s = Linux, uname -m = ia32 | i686 | i386)
  • linux_x86_64 (uname -s = Linux, uname -m = x86_64)
  • linux_avx

The distribution has been built on both Enterprise Linux versions 4 and 5. Enterprise Linux 5 has a system bug that must be fixed in order to build the SUMS server (it does not affect the DRMS client). See the platform notes for instructions on how to fix this bug.
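
To see which default target name applies to a given machine, the mapping above can be written out as a short Python sketch. This is not the logic the NetDRMS makefiles actually use; the function names are invented for illustration. The fallback name custom and the JSOC_MACHINE override are taken from the paragraphs that follow. The sketch does not try to detect linux_avx, since the text gives no uname values for it.

```python
import os
import platform


def default_target(system, machine):
    """Map uname -s / uname -m values to the default target names listed above."""
    if system == "Linux":
        if machine in ("ia32", "i686", "i386"):
            return "linux_ia32"
        if machine == "x86_64":
            return "linux_x86_64"
    return "custom"    # any other architecture gets the target name 'custom'


def target_name(environ=None):
    """Prefer an explicit JSOC_MACHINE setting; otherwise derive the default."""
    environ = os.environ if environ is None else environ
    return environ.get("JSOC_MACHINE") or default_target(
        platform.system(), platform.machine())


if __name__ == "__main__":
    print(target_name())
```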

If you are making on any other architecture, the target name will be custom. Binaries and libraries will be placed in appropriate subdirectories based on these names. If you will be making on multiple architectures, or if you wish to change the target architecture name, you should either add the following line near the beginning of the file $DRMS/make_basic.mk

  • JSOC_MACHINE = name

or set your environment variable JSOC_MACHINE to name before running the make. The latter is recommended for future use, so that you can set appropriate paths in your login or shell initialization scripts. If necessary, edit the file $DRMS/make_basic.mk to set your compiler options. The default compilers for Linux are the Intel compiler icc and ifort if available; otherwise gcc and gfortran. If you prefer to use different compilers, change the following two lines in the file accordingly:

  • COMPILER = icc
  • FCOMPILER = ifort

Note that the DRMS Fortran API requires a Fortran 90 compiler. The Fortran compiler is only required if you wish to build Fortran modules that will link against the DRMS library; nothing in the DRMS and SUMS internals and applications uses Fortran. Besides ifort, the gfortran43 compiler should work; there may be a problem with f95.

Note that you can only build on a system on which the PostgreSQL client libraries exist (e.g. libecpg.a). You will also require the OpenSSL secure sockets toolkit; you should have a /usr/include/openssl directory, or an equivalent, on your system where the compiler can locate it by default.

N.B. If you are using the icc compiler, version 11 is recommended. There are some very nasty bugs in version 10.*.

In the root directory $DRMS, type make. If all goes well, the directory $DRMS/bin/arch_name will be created and filled, likewise the library directory $DRMS/lib/arch_name. If you are building on multiple architectures, repeat this step on each one, being careful to follow the instructions above on each.

These instructions should suffice for all users except the manager who needs to initialize the database and/or start the SUMS server. If you do not need to start a SUMS server, you are done. The SUMS manager (production user) should continue with the next step.
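
Before running make, it can save time to check for the prerequisites mentioned above. The Python sketch below (it needs Python 3.3 or later for shutil.which) looks for psql and a C compiler in the PATH and for the OpenSSL header directory. The function name and the injectable which/isdir arguments are conveniences for this illustration; the check is not part of the NetDRMS distribution and does not cover every dependency.

```python
import os
import shutil


def missing_prerequisites(which=shutil.which, isdir=os.path.isdir,
                          openssl_include="/usr/include/openssl"):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if which("psql") is None:
        problems.append("psql not found in PATH")
    if not any(which(cc) for cc in ("icc", "gcc")):
        problems.append("neither icc nor gcc found in PATH")
    if not isdir(openssl_include):
        problems.append("OpenSSL headers not found in " + openssl_include)
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("WARNING: " + problem)
```

If your OpenSSL headers live somewhere other than /usr/include/openssl, pass that directory as openssl_include.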

There are two parts to setting up NetDRMS. First, the necessary services must be set up at the institution or group that will be hosting the NetDRMS service. The basic preparation and installation only needs to be done once, although the actual software distribution may be updated from time to time without affecting the setup. Second, individual users may wish to set up the NetDRMS software distribution for use or development in their own environment. Again, there are a few administrative tasks that need to be performed once when a user is registered, but the software may be updated or rebuilt at any time. Once the site preparation and setup is complete, user setup is a simple task, so there are two sets of instructions. Most users only need to concern themselves with the second, Installing / Upgrading NetDRMS.

Building and installing SUMS

To make the SUMS server available, the SUMS manager (and only the SUMS manager) needs to run make sums in the DRMS root directory, as described in the steps below. This only needs to be done once for the system; individual users do not need to do it. At this point, if you are the SUMS manager, you are ready to proceed with the configuration, build and start of SUMS services. Proceed to the SUMS setup instructions. Otherwise you are ready to go. Please note that you will see a great many warning messages, possibly pages of them, as NetDRMS and SUMS compile. Unless the build stops with an error, you should be okay to proceed.

  1. Build the SUMS binaries:

    su - <production user>; cd $JSOCROOT; ./configure; make sums

  2. Copy the sum_chmown program to <path to sum_chmown> (chosen in step 1a. above), make root the owner, and give it setuid privileges:

    su - root
    cp $JSOCROOT/drms/_linux_x86_64/base/sums/apps/sum_chmown <path to sum_chmown>
    chown root:root <path to sum_chmown>
    chmod u+s <path to sum_chmown>

    Note: some sites have replaced this program with one that does nothing when called. These sites have only one user that writes files to SUMS, however, and so need not be concerned about different users with different permissions writing files to SUMS.
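
To confirm that the chown and chmod commands above took effect, a small check along the following lines may help. It only inspects the file's owner and mode bits; the function names are invented for this sketch.

```python
import os
import stat


def is_setuid_root(st_mode, st_uid):
    """True if the setuid bit is set and the owner is root (uid 0)."""
    return bool(st_mode & stat.S_ISUID) and st_uid == 0


def check_sum_chmown(path):
    """Stat the installed program and report whether it is setuid-root."""
    st = os.stat(path)
    return is_setuid_root(st.st_mode, st.st_uid)
```

Call check_sum_chmown with the same path you used for <path to sum_chmown>; it returns False if either the owner or the setuid bit is not as expected.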

  3. Start SUMS:

    $JSOCROOT/base/sums/scripts/sum_start.NetDRMS

    The script does not return a prompt after echoing "sum_svc now available". Just hit RETURN.

  4. To stop SUMS for any reason, run this script:

    $JSOCROOT/base/sums/scripts/sum_stop.NetDRMS

Remote SUMS

A local NetDRMS may contain data produced by other, non-local NetDRMSs. Via a variety of means, the local NetDRMS can obtain and ingest the database information for these data series produced non-locally. In order to use the associated data files (typically image files), the local NetDRMS must download the storage units (SUs) associated with these data series too. There are currently two methods to facilitate these SU downloads. The Java Mirroring Daemon (JMD) is a tool that can be installed and configured to download SUs automatically as the series data records are ingested into the local NetDRMS. It fetches these SUs before they are actually used. It can obtain the SUs from any other NetDRMS that has the SUs, not just the NetDRMS that originally produced them. Remote SUMS is a built-in tool that comes with NetDRMS. It downloads SUs as needed - i.e., if a module or program requests the path to the SU or attempts to read it, and it is not present in the local SUMS yet, Remote SUMS will download the SUs. While the SUs are being downloaded, the initiating module or program will poll waiting for the download to complete.

Remote SUMS consists of several components. On the client side (the local NetDRMS), a daemon, rsumsd.py, must be running. Some database tables must also exist, as well as some binaries used by the daemon. On the server side (every NetDRMS site that wishes to act as a source of SUs for the client), there is a CGI, rs.sh. In response to requests that contain a list of SUNUMs, this CGI returns file-server information (hostname, port, user, SU paths, etc.) for the SUs the server has available. When the client encounters requests for remote SUs that are not contained in the local SUMS, it asks the daemon to download those SUs. The client code then polls, waiting for the request to be serviced. The daemon in turn sends requests to the rs.sh CGIs at all the relevant providing sites. The owning sites return the file-server information to the daemon, and the daemon then downloads the requested SUs via scp and notifies the client module once the SUs are available for use. The client module then exits its polling code and continues, using the freshly downloaded SUs.

To use Remote SUMS, the config.local configuration file must first be configured properly, and NetDRMS must be re-built. Here are the relevant config.local parameters:

  • JMD_IS_INSTALLED - This must be set to 0 for Remote SUMS use. Currently, either the JMD or the Remote SUMS features can be used, but not both at the same time.
  • RS_REQUEST_TABLE - This is the database table used by the local module and the rsumsd.py daemon running at the local site for communicating SU-download requests. Upon encountering a non-native SUNUM, DRMS will insert a new record into this table to initiate a request for the SUNUM from the owning NetDRMS. The Remote SUMS daemon will service the request and update this record with results.
  • RS_SU_TABLE - This is the database table used by the Remote SUMS daemon to track SUs downloaded from the providing sites.
  • RS_DBHOST - This is the local database-server host that hosts the database containing the requests and SU tables.
  • RS_DBNAME - This is the database on the host that contains the requests and SU tables.
  • RS_DBPORT - This is the port on which the local database-server host accepts connections.
  • RS_DBUSER - This is the database user account that the Remote SUMS daemon uses to manage the Remote SUMS requests.
  • RS_LOCKFILE - This is the path to a file that ensures that only one Remote SUMS daemon instance runs.
  • RS_LOGDIR - This is the directory into which the Remote SUMS daemon logs are written.
  • RS_REQTIMEOUT - This is the timeout, in minutes, for a new SU request to be accepted for processing by the daemon. If the daemon encounters a request older than this value, it will reject the new request.
  • RS_DLTIMEOUT - This is the timeout, in minutes, for an SU to download. If the time the download takes exceeds this value, then all requests waiting for the SU to download will fail.
  • RS_MAXTHREADS - The maximum number of download threads that the Remote SUMS daemon is permitted to run simultaneously. One thread is one scp call.
  • RS_BINPATH - The NetDRMS-binary-path that contains the external programs needed by the Remote SUMS daemon (jsoc_fetch, vso_sum_alloc, vso_sum_put).
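
As a sanity check on these settings, the Python sketch below takes the parameters as a dictionary of strings (how you read them out of config.local is left open, since the file syntax is not described here) and reports obvious problems. The function name is invented for illustration, and treating the port, the two timeouts, and the thread count as positive integers is an assumption of this sketch.

```python
REQUIRED = (
    "RS_REQUEST_TABLE", "RS_SU_TABLE", "RS_DBHOST", "RS_DBNAME", "RS_DBPORT",
    "RS_DBUSER", "RS_LOCKFILE", "RS_LOGDIR", "RS_REQTIMEOUT", "RS_DLTIMEOUT",
    "RS_MAXTHREADS", "RS_BINPATH",
)
# Assumed to be positive whole numbers (port, minutes, thread count).
POSITIVE_INTS = ("RS_DBPORT", "RS_REQTIMEOUT", "RS_DLTIMEOUT", "RS_MAXTHREADS")


def check_remote_sums_params(params):
    """Return a list of problems found in a dict of config.local values."""
    problems = []
    if str(params.get("JMD_IS_INSTALLED", "")).strip() != "0":
        problems.append("JMD_IS_INSTALLED must be 0 for Remote SUMS")
    for name in REQUIRED:
        if not str(params.get(name, "")).strip():
            problems.append(name + " is not set")
    for name in POSITIVE_INTS:
        value = str(params.get(name, "")).strip()
        if value and not (value.isdigit() and int(value) > 0):
            problems.append(name + " must be a positive integer")
    return problems
```

An empty list means that all listed parameters are present and that JMD_IS_INSTALLED is 0; it does not verify that the tables, host, or paths actually exist.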

After setting up config.local, you must build or re-build NetDRMS:

> cd $JSOCROOT
> ./configure
> make

It is important to ensure that three binaries needed by the Remote SUMS daemon have been built: jsoc_fetch, vso_sum_alloc, vso_sum_put.

Ensure that Python >= 2.7 is installed. You will need to install some packages if they are not already installed: psycopg2, ...

An output log named rslog_YYYYMMDD.txt will be written to the directory identified by the RS_LOGDIR config.local parameter, so make sure that directory exists.
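
A minimal sketch for preparing the log directory follows. It creates the directory if it is missing, checks that it is writable, and shows the log file name for a given date. The function names are illustrative, and whether the daemon stamps the file with the local or the UTC date is not stated here, so the default of today's local date is an assumption.

```python
import datetime
import os


def rslog_path(logdir, day=None):
    """Return the rslog_YYYYMMDD.txt path for the given date (default: today)."""
    day = datetime.date.today() if day is None else day
    return os.path.join(logdir, day.strftime("rslog_%Y%m%d.txt"))


def ensure_logdir(logdir):
    """Create the log directory if it is missing; return True if it is writable."""
    if not os.path.isdir(logdir):
        os.makedirs(logdir)
    return os.access(logdir, os.W_OK)
```

Run ensure_logdir as the user that will run the daemon, with the directory named by RS_LOGDIR, so that the writability check reflects that user's permissions.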

Provide all providing NetDRMS sites your public SSH key. They will need to put that key in their authorized_keys file.

Create the client-side Remote SUMS database tables. Run:

> $JSOCROOT/base/drms/scripts/rscreatetabs.py op=create tabs=req,su

Start the rsumsd.py daemon as the user specified by the RS_DBUSER config.local parameter. As this user, start an ssh-agent process and add the private key to it:

> ssh-agent -c > $HOME/.ssh-agent_rs
> source $HOME/.ssh-agent_rs
> ssh-add $HOME/.ssh/id_rsa

This will allow you to create a public-private key pair that has a passphrase while obviating the need to manually enter that passphrase when the Remote SUMS daemon runs scp.

Start SUMS:

> $JSOCROOT/base/sums/scripts/sum_start.NetDRMS >& <log dir>/sumsStart.log

Substitute your favorite log directory for <log dir>. There is another daemon, sums_procck.py, that keeps SUMS up and running once it is started. Redirecting to a log will preserve important information that this daemon prints. To stop SUMS, use $JSOCROOT/base/sums/scripts/sum_stop.NetDRMS.

Start the Remote SUMS daemon:

> $JSOCROOT/base/drms/scripts/rsumsd.py

JsocWiki: DRMSSetup (last edited 2024-01-19 09:08:03 by ArtAmezcua)