#format plain /home/production/cvs/JSOC/doc/whattodo_sum_partn_on_off.txt If a file server is down, it's /SUM partitions should be taken offline so that SUMS does not try to allocate storage from them and then hang on the mkdir that it is unable to do. Here are the /SUM partitions on each file server: d02: /SUM0 thru /SUM20s d03: /SUM30s d04: /SUM40s The way to take a /SUM partition offline is to set its pds_set_num = -1. This is done in the sum_partn_avail db table which looks like: partn_name | total_bytes | avail_bytes | pds_set_num | pds_set_prime ------------+----------------+---------------+-------------+--------------- /SUM37 | 25000000000000 | 1349566595072 | 0 | 0 /SUM4 | 33000000000000 | 1600150855680 | 0 | 0 /SUM2 | 33000000000000 | 1599634276352 | 0 | 0 /SUM20 | 33000000000000 | 1599828013056 | 0 | 0 /SUM21 | 33000000000000 | 1600223768576 | 0 | 0 /SUM22 | 33000000000000 | 1599709511680 | 0 | 0 /SUM3 | 33000000000000 | 1599469428736 | 0 | 0 [etc.] Do this as user production on a machine with psql (e.g. n02) and adjust for which file server (d03) is to be taken offline: > psql -h hmidb -p 5434 jsoc_sums jsoc_sums=> select * from sum_partn_avail where partn_name like '/SUM3%'; partn_name | total_bytes | avail_bytes | pds_set_num | pds_set_prime ------------+----------------+---------------+-------------+--------------- /SUM3 | 33000000000000 | 1599825543168 | 0 | 0 /SUM30 | 22000000000000 | 1199689433088 | 0 | 0 /SUM31 | 22000000000000 | 1199430434816 | 0 | 0 /SUM32 | 22000000000000 | 747345281024 | 0 | 0 /SUM33 | 22000000000000 | 786056609792 | 0 | 0 /SUM34 | 25000000000000 | 1349826641920 | 0 | 0 /SUM35 | 25000000000000 | 1349841321984 | 0 | 0 /SUM36 | 25000000000000 | 1349845516288 | 0 | 0 /SUM37 | 25000000000000 | 1349560303616 | 0 | 0 (9 rows) jsoc_sums=> update sum_partn_avail set pds_set_num=-1 where jsoc_sums-> partn_name in ('/SUM30', '/SUM31', '/SUM32', '/SUM33', jsoc-sums(> '/SUM34', '/SUM35', '/SUM36', '/SUM37'); [Notice that we did not set '/SUM3' which is not on the d03 fileserver.] jsoc_sums=> \q Now force all the sum_svc processes to reread this new sum_partn_avail table. > sumrepartn When d03 is back on the air, rerun the update command shown above with pds_set_num=0 and do again: > sumrepartn