JSOC keywords notes – leading to a future document(s).
JSOC Keywords used for metadata
The DRMS system is based on data organized as records containing keyword tagged metadata and data arrays stored in named “segments”. To be useful the set of commonly used keywords must be agreed to by the majority of users and should be easily recognized by researchers familiar with SOHO/MDI and IDL SolarSoft (SSW) data formats. Both MDI and SSW are based on the FITS standard for external file formats.
Naming Syntax Issues
The space of keyword names is constrained by implementation details with various heritages. In some language bindings the internal name seen by the programmer differs from the name actually stored in the data files. (E.g. the FITS “DATE-OBS” keyword is “DATE-OBS” in a file but is referenced as “DATE_d$OBS” in SolarSoft IDL code.) Similarly, DRMS supports only a limited set of characters in keyword names, since those names are also column names in PostgreSQL database tables. The common FITS standard for data exchange limits both the characters allowed (upper case letters, digits, minus sign “-”, and underscore “_”) and the keyword length (8 characters, left adjusted on the line and blank filled); these limits reflect the 80 column punched card heritage of the FITS standard. Some language bindings for FITS remove the trailing blanks, another example of a difference between internal and external representation. DRMS limits keywords to 254 characters in length and to ASCII letters, digits, and the underscore “_” character. In DRMS, case is preserved but ignored. Names must begin with a letter.
Since DRMS names are not allowed to contain a “-”, and some of the FITS Standard keywords do require a “-”, the internal and external names for at least some keywords must differ. (This is similar to the IDL problem with “-”.) Additionally, some commonly used lists of FITS keywords are inconsistent with each other in the choice of keyword names for the same quantity, and some are even inconsistent in the meaning of keywords with the same name. Some of the inconsistencies are due to poorly specified usage guidelines. An example is again “DATE-OBS”: MDI uses it to identify the time at which the image represents the measured quantity, while SSW uses it to represent the start of the interval over which the observation was gathered. For spatial information, however, both MDI and SSW use keywords which reference the center of the tagged location, i.e. the center of a pixel rather than its lower left corner, recognizing that a pixel has a finite extent in image space. Given these issues we must be careful to define the working set of internal DRMS keywords and the required mapping to FITS keywords for export to SSW and other “consumers”. Export to MDI should not be an issue since we will eventually port MDI analysis over to DRMS. But we expect to export data as FITS files to users of GONG based programs as well as SSW, so we need to be careful.
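The naming rules above can be written down as two small checks. This is only a sketch of the rules as stated in these notes (DRMS: letters, digits, underscore, leading letter, at most 254 characters, case ignored; FITS: upper case letters, digits, “-” and “_”, at most 8 characters). The function names are made up for this illustration and are not part of DRMS.

```python
import re

# DRMS: ASCII letters, digits, underscore; must begin with a letter; <= 254 chars.
DRMS_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,253}")
# FITS: upper case letters, digits, hyphen, underscore; <= 8 chars.
FITS_NAME = re.compile(r"[A-Z0-9_-]{1,8}")


def is_valid_drms_name(name: str) -> bool:
    """True if name satisfies the DRMS keyword naming rules given above."""
    return DRMS_NAME.fullmatch(name) is not None


def is_valid_fits_name(name: str) -> bool:
    """True if name satisfies the FITS keyword naming rules given above."""
    return FITS_NAME.fullmatch(name) is not None


def same_drms_name(a: str, b: str) -> bool:
    """DRMS preserves case but ignores it when comparing names."""
    return a.lower() == b.lower()
```

With these checks “DATE-OBS” is a valid FITS name but not a valid DRMS name, which is exactly the case that forces internal and external names to differ.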
DRMS and FITS
In addition to keyword issues, there are other concerns that must be addressed when exporting DRMS data into FITS files. The basic DRMS record structure is certainly a subset of the rich FITS format, but there are some limitations in addition to mapping 254 char keyword names to 8 chars.
The primary issues concern links, which are supported by DRMS. A DRMS record may contain zero or more links to other DRMS records. Furthermore, keywords and segments in the linked records may be referenced from the record with the link as if they were in that record rather than in the link target record. On export, keyword links will be followed but the record links themselves will vanish. Links may be static or dynamic. A static link points to a specific unchanging DRMS record. A dynamic link points to the DRMS record with a particular prime index value. If the link target record is updated, the dynamic link will point to the most recent version. Upon export, dynamically linked keywords get the values valid at export time.
A DRMS record may contain multiple data arrays with varying dimension. If we restrict our use of FITS exports to Simple FITS files we will need to decide how to handle multiple array records. Some DRMS records will contain no array data but will have a record per time step. Such dataseries could reasonably be exported as FITS tables. We will need to decide if this will be implemented. The MDI FITS reader does not support FITS tables. If FITS tables are commonly used in SSW code we should be able to support them as an export product. Multiple images/arrays per record can also be handled in FITS as extensions. As a minimum, we should be able to export SSW compliant simple images as single FITS files.
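As an illustration only, the behaviour of static and dynamic links on export described above can be sketched in Python. The Record and Series classes, the tuple form of a link, and the use of the highest record number as the “most recent version” are all assumptions made for this sketch; none of this is the DRMS API.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    recnum: int        # unique record number (assumed to grow with each new version)
    prime_key: str     # prime index value, e.g. an observation time
    keywords: dict = field(default_factory=dict)


class Series:
    """Hypothetical in-memory stand-in for a DRMS series."""

    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append(record)

    def by_recnum(self, recnum):
        return next(r for r in self.records if r.recnum == recnum)

    def latest(self, prime_key):
        # Assumption: the most recent version has the highest recnum.
        matches = [r for r in self.records if r.prime_key == prime_key]
        return max(matches, key=lambda r: r.recnum)


def resolve_link(series, link):
    """link is ("static", recnum) or ("dynamic", prime_key)."""
    kind, target = link
    if kind == "static":
        return series.by_recnum(target)
    return series.latest(target)


def export_keywords(own_keywords, linked):
    """Flatten linked keywords into the exported header; the links vanish.

    linked maps a keyword name to a (series, link) pair.
    """
    header = dict(own_keywords)
    for name, (series, link) in linked.items():
        header[name] = resolve_link(series, link).keywords[name]
    return header
```

A static link keeps returning the value from the record it was made against, while a dynamic link picks up whatever version is current at export time.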
Image Coordinate Mapping Keywords
We plan to use FITS standard keywords to describe the mapping between pixel space and physical space, the so-called “world coordinate system” or WCS. In WCS each physical coordinate axis is described by a set of keywords specifying the type of coordinate, the physical units, and the mapping onto array elements, which are assumed to be image pixels. There is a design weakness in the FITS Paper I coordinate mapping rules when they are used to map telescope images onto locations in the sky and then onto the Sun. The essence of the issue is an assumed known mapping between pixel addresses and arc-seconds. The FITS WCS paper (Greisen and Calabretta, A&A 395, 1061-1075, 2002) definitions of CRPIXj, CRVALi, and CTYPEi, combined with one of CROTAj and CDELTi; the PCi_j and CDELTi; or just the CDi_j, imply in common usage (e.g. CTYPE1=SOLARX) that the mapping to arc-seconds is well known. (These keywords are described below.) This is seldom the case. For some instruments and analyses a crude estimate of this relationship may be sufficient. For helioseismology it is not sufficient. A priori estimates of the plate scale are also not sufficient for comparing pixel-sized solar features between instruments. The correct mapping between a pixel address and a location in the Sun’s atmosphere is very complex even for an instrument where the light originates from a volume whose thickness is small compared to a pixel’s horizontal scale. For EUV imaging of optically thin regions there is an additional uncertainty in height. What one would ideally want is a mapping of pixel brightness to a solar angular coordinate (e.g. latitude and longitude in some well defined coordinate system) and distance from the Sun’s center, or height above some well defined surface.
It is however likely that, for many users who are interested primarily in morphology, a rough approximation of the telescope image scale and distortions is sufficient for quick-look use. We propose to allow increasing accuracy of the mapping keywords as the data progresses from as-observed telemetry data to higher levels of analysis. In this process we will explicitly specify what the approximations are, which keywords contain measured information, and which contain derived quantities. This will involve maintaining some “legacy” keyword names that have been developed to accurately describe the data at lower levels of processing, and transitioning to the WCS keywords as the preferred usage at higher levels of calibration.
We propose the following scenario:
AIA: a FIXED value for the telescope plate scale for each telescope is assumed. This value will be stored in the keyword IM_SCALE. At the first opportunity (level-0.3) CTYPE will be set to SOLARX and SOLARY, with CROTA1 == CROTA2 to be logically applied before assigning the implied labels of X and Y used to map array index 1 and 2. The “SOLARY” direction will be the projection of the Carrington solar rotation axis onto the plane of the sky (+ is north) and “SOLARX” is perpendicular to that also in the plane of the sky, (+ west on Sun which is roughly in the direction of Earth orbit motion). Once this is done XCEN and YCEN can be computed.
HMI: a FIXED value for the radius of the Sun in meters combined with a measured average radius (pixels) of the solar image using a non-changing definition of the solar limb, combined with the known distance between the telescope and the solar center (not photosphere). Here the keywords “R_SUN”, “X0”, and “Y0” will contain the key information from which the other values are computed. R_SUN, X0, Y0 are all in pixels with center of the lower left pixel of the array set to 0.0, 0.0. X0 and Y0 are the location of the solar disk center in the image as is. Then we probably set CRPIX1 == X0+1, CRPIX2 == Y0+1.
Then in order:
From SDO attitude and orbit information:
SAT_ROT = for lev<0.5, intended/commanded roll such that SAT_ROT is degrees of rotation of the image of the Sun’s pole projection onto the CCD with Sun’s N pole CW from the “y” CCD axis for positive SAT_ROT. I.e. SAT_ROT is positive for a CCW roll of SDO when viewed from behind SDO looking toward the Sun. (This convention must be verified with SDO FDS conventions.) For lev>=0.5 this will be corrected with data from SDO attitude data.
DSUN_OBS distance to Sun center from spacecraft in m, c. 1.5E11.
RSUN_REF = radius of Sun in m, agreed upon standard, c. 6.96E8 but must be consistent with WAVELNTH keyword.
WAVELNTH wavelength (nm) of observation. For HMI use 617.3.
For each CCD camera:
INST_ROT = Telescope roll angle. The angle between the instrument CCD Y axis and the SDO Z axis. This is a calibration (nearly constant) determined for each AIA ATA or HMI camera. The sign convention should be the same as for SAT_ROT after any required image flipping to allow solar west to be to the right when solar north is up.
Now, starting at level-0 we can provide the following:
For AIA:
X0, Y0 = computed from commanded pointing and known offsets, pixel address of the science reference boresight with telescope specific corrections.
IM_SCALE Predefined AIA plate scale in arc-seconds per pixel.
R_SUN computed from DSUN_OBS, RSUN_REF, and IM_SCALE.
For HMI:
X0, Y0 for lev<0.8 computed from commanded pointing and known offsets, pixel address of the science reference boresight with telescope specific corrections; for lev>=0.8 computed from fit to image.
R_SUN for lev<0.8 computed from IM_SCALE, DSUN_OBS and RSUN_REF; for lev>=0.8 fit to image.
IM_SCALE for lev<0.8 set to ~ 0.5; for lev >=0.8 computed from R_SUN and DSUN_OBS and RSUN_REF.
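The relation between R_SUN, IM_SCALE, DSUN_OBS and RSUN_REF used above can be sketched as follows. The notes do not spell out the formula, so this assumes simple geometry: the apparent angular radius of the limb is asin(RSUN_REF / DSUN_OBS), converted to arc-seconds and divided by the plate scale. The function names are hypothetical.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def apparent_radius_arcsec(dsun_obs_m, rsun_ref_m):
    """Angular radius of the solar limb as seen from the spacecraft.

    Assumption: the limb is the tangent point of a sphere of radius RSUN_REF.
    """
    return math.asin(rsun_ref_m / dsun_obs_m) * ARCSEC_PER_RADIAN


def r_sun_pixels(dsun_obs_m, rsun_ref_m, im_scale):
    """R_SUN in pixels from DSUN_OBS (m), RSUN_REF (m), IM_SCALE (arcsec/pixel)."""
    return apparent_radius_arcsec(dsun_obs_m, rsun_ref_m) / im_scale


def im_scale_from_fit(r_sun_pix, dsun_obs_m, rsun_ref_m):
    """IM_SCALE (arcsec/pixel) from a fitted R_SUN (pixels), as for HMI lev>=0.8."""
    return apparent_radius_arcsec(dsun_obs_m, rsun_ref_m) / r_sun_pix
```

The first function corresponds to the lev<0.8 direction (plate scale assumed, radius derived) and the last to the lev>=0.8 direction (radius measured, plate scale derived).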
For both AIA and HMI for lev0.3 and above:
CDELT1 == CDELT2 = IM_SCALE if full resolution, else scaled from IM_SCALE.
CROTA1 == CROTA2 = SDO_ROLL with corrections level 0.5 and above.
CRPIX1, CRPIX2 = X0, Y0
CRVAL1, CRVAL2 = 0.0, 0.0
XCEN, YCEN computed from above.
Figure 1: FITS WCS Coordinate Mapping Sequence from the WCS paper. Shows the relations between the world coordinates indexed by i and the array coordinates indexed by j.
General Descriptive Keywords in DRMS and for FITS Exports
There are a number of general purpose keywords that are FITS standard reserved keywords that can be adopted as DRMS keywords. These are NOT all present at all levels of processing but will only be added when they can be properly determined. They will not be carried to higher levels when they are no longer applicable. These fall into several sets:
These Cxxxxx keywords may have multiple sets present. If so, the sets beyond the first have a single letter suffix indicating the set. Additionally a WCSNAMEa keyword should be added to identify the set. E.g. if a single second set is present then the additional keywords will be: WCSNAMEA, CTYPE1A, CTYPE2A, CRPIX1A, CRPIX2A, CRVAL1A, CRVAL2A, etc.
a = CROTA2; XCEN = CRVAL1 + CDELT1*cos(a)*((NAXIS1+1)/2 - CRPIX1) - CDELT2*sin(a)*((NAXIS2+1)/2 - CRPIX2)
a = CROTA2; YCEN = CRVAL2 + CDELT1*sin(a)*((NAXIS1+1)/2 - CRPIX1) + CDELT2*cos(a)*((NAXIS2+1)/2 - CRPIX2)
Coordinate Mappings – CTYPEs
The list of CTYPE axes is at least:
Coordinate Mappings for level 0.3 and above for HMI and AIA
The FITS WCS standard allows multiple coordinate specifications to be present. This would allow e.g. both a SOLARX/Y and HPLN-TAN/HPLT-TAN specification to be given. In this case CTYPE1A would be HPLN-TAN, etc. Therefore we recommend that two sets be used for level 0.3 and above. The unadorned first set (i.e. no suffix letter) will be SOLARX and SOLARY and the second set, e.g. CTYPE1A and CTYPE2A will be HPLN-TAN and HPLT-TAN respectively. Also then we should set WCSNAME = ‘solarxy’ and WCSNAMEA = ‘Helioprojective-cartesian’
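A minimal sketch of the two recommended keyword sets, written as a Python dictionary builder. The function name is hypothetical. The CDELT values in degrees for the “A” set follow the note later in this document that the Helioprojective-cartesian mapping uses CDELTi in degrees, and the assumption that both sets describe the same pixel scale is ours.

```python
def wcs_keyword_sets(im_scale_arcsec):
    """Return the primary (SOLARX/SOLARY) and alternate 'A' (HPLN-TAN/HPLT-TAN) sets."""
    cdelt_deg = im_scale_arcsec / 3600.0  # arc-seconds per pixel -> degrees per pixel
    return {
        # First set: no suffix letter, CDELT in arc-seconds.
        "WCSNAME": "solarxy",
        "CTYPE1": "SOLARX",
        "CTYPE2": "SOLARY",
        "CDELT1": im_scale_arcsec,
        "CDELT2": im_scale_arcsec,
        # Second set: suffix 'A', CDELT in degrees.
        "WCSNAMEA": "Helioprojective-cartesian",
        "CTYPE1A": "HPLN-TAN",
        "CTYPE2A": "HPLT-TAN",
        "CDELT1A": cdelt_deg,
        "CDELT2A": cdelt_deg,
    }
```

All of the suffixed names still fit in 8 characters, so no name mapping is needed for the second set on export.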
Conversion from MDI conventions for the keywords used by e.g. v2helio:
MDI | JSOC | NOTE
T_OBS | T_OBS |
OBS_B0 | CRLT_OBS | STEREO uses HGLT_OBS
OBS_L0 | CRLN_OBS | STEREO uses HGLN_OBS
SOLAR_P | -(SAT_ROT+INST_ROT) | Note sign difference.
IM_SCALE | IM_SCALE | Note this is CCD plate scale
S_MAJOR | n.a. | Will be fixed in lev1, use 1.0
S_MINOR | n.a. | Will be fixed in lev1, use 1.0
S_ANGLE | n.a. | Will be fixed in lev1, use 0.0
X_SCALE | CDELT1 / IM_SCALE | Rebin info, CCD pixels per bin. Implied from CDELT and IM_SCALE
Y_SCALE | CDELT2 / IM_SCALE |
DATASIGN | DATASIGN |
OBS_VR | OBS_VR |
OBS_VW | OBS_VW |
OBS_VN | OBS_VN |
ORIENT | | ‘SESW’ normal, should be set to SESW in lev0 processing.
OBS_DIST | DSUN_OBS |
XSCALE | |
X0 | X0 |
Y0 | Y0 |
BFITZERO | n.a., use 0.0 | If BFITZERO is non-zero the data should be corrected on input.
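The table above can be read as a recipe for building the MDI-style keywords that a program such as v2helio expects from a JSOC record. The following is a sketch under that reading; the function name and the use of a plain dictionary for the record keywords are assumptions. XSCALE is omitted because the table gives no JSOC counterpart for it.

```python
def mdi_keywords_from_jsoc(jsoc):
    """Build MDI-convention keywords from a dict of JSOC keywords."""
    mdi = {
        "OBS_B0": jsoc["CRLT_OBS"],
        "OBS_L0": jsoc["CRLN_OBS"],
        # Note the sign difference between the two conventions.
        "SOLAR_P": -(jsoc["SAT_ROT"] + jsoc["INST_ROT"]),
        # The table does not say whether a unit conversion is needed; none is applied.
        "OBS_DIST": jsoc["DSUN_OBS"],
        # Rebin info: CCD pixels per bin, implied from CDELT and IM_SCALE.
        "X_SCALE": jsoc["CDELT1"] / jsoc["IM_SCALE"],
        "Y_SCALE": jsoc["CDELT2"] / jsoc["IM_SCALE"],
        # No JSOC equivalent; the table gives fixed values to use.
        "S_MAJOR": 1.0,
        "S_MINOR": 1.0,
        "S_ANGLE": 0.0,
        "BFITZERO": 0.0,
        "ORIENT": "SESW",
    }
    # Keywords carried over under the same name.
    for name in ("T_OBS", "IM_SCALE", "DATASIGN", "OBS_VR", "OBS_VW", "OBS_VN", "X0", "Y0"):
        mdi[name] = jsoc[name]
    return mdi
```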
Misc NOTES
SSW uses CTYPE = degrees-latitude and degrees-longitude for lat/lon heliographic coordinates in remapped images.
SSW uses solar_x and solar_y, and sometimes solar-x and solar-y.
Some documents show solar-x and solar-y instead of SOLARX and SOLARY.
And add a second set, e.g. CTYPE1A = HPLN-TAN with CDELT1A in degrees, etc. for the export fixup.
The “FITS Coordinate Systems for Solar Image Data” WCS paper, in its Helioprojective-cartesian mapping, uses HPLN-TAN and HPLT-TAN with CDELTi in degrees instead of SOLARX and SOLARY with CDELT in arc-seconds.
XCEN and YCEN are a problem. They are defined in the SolarSoft documentation as:
[Table of definitions not recovered; its columns were “Suggested” and “De Facto Solar Convention”.]
But CRVALi only exist AFTER CDELT and CROTA have been applied, so the definitions are not consistent. In particular, when CROTA is not zero, XCEN as built above has no meaning. A correct definition is probably:
a = CROTA2
XCEN = CRVAL1 + CDELT1*cos(a)*((NAXIS1+1)/2 - CRPIX1) - CDELT2*sin(a)*((NAXIS2+1)/2 - CRPIX2)
YCEN = CRVAL2 + CDELT1*sin(a)*((NAXIS1+1)/2 - CRPIX1) + CDELT2*cos(a)*((NAXIS2+1)/2 - CRPIX2)
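The proposed definition translates directly into code. This sketch takes the header as a plain dictionary and assumes CROTA2 is stored in degrees, as is usual for FITS, so it is converted to radians before use. The function name is hypothetical.

```python
import math


def image_center(hdr):
    """Return (XCEN, YCEN) from a dict of FITS keywords, using the formula above."""
    a = math.radians(hdr["CROTA2"])  # assumption: CROTA2 is in degrees
    # Offset of the array center from the reference pixel, in pixels.
    dx = (hdr["NAXIS1"] + 1) / 2 - hdr["CRPIX1"]
    dy = (hdr["NAXIS2"] + 1) / 2 - hdr["CRPIX2"]
    xcen = hdr["CRVAL1"] + hdr["CDELT1"] * math.cos(a) * dx - hdr["CDELT2"] * math.sin(a) * dy
    ycen = hdr["CRVAL2"] + hdr["CDELT1"] * math.sin(a) * dx + hdr["CDELT2"] * math.cos(a) * dy
    return xcen, ycen
```

When the reference pixel is the array center and CRVAL1 = CRVAL2 = 0, both XCEN and YCEN are zero for any roll angle, which is a quick sanity check on the definition.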
LONPOLE, LATPOLE, WCSNAMEa
DSUN_OBS instead of D_SUN
HGLN_OBS =0 and HGLT_OBS=B0
Or CRLN_OBS is Carrington longitude of disk center.
STEREO uses:
CARX_OBS [7] Carrington X coordinate of observer
CARY_OBS [7] Carrington Y coordinate of observer
CARZ_OBS [7] Carrington Z coordinate of observer
CAR_ROT [5] Carrington rotation number
OR CRLN_OBS and CRLT_OBS
RSUN_REF [7] Value of Rsun (meters) used in determining coordinates
ORIGIN [1] Responsible organization or institution
STEREO and AIA ask for:
OBJECT Name of object
OBJ_ID Object identifier, e.g. active region number
OBSERVER Name of observer
OBS_PROG Name of the observing program
SCI_OBJ The science objective of the observation
Appendix A. SDO Coordinate Definitions
Appendix B: FITS to DRMS to FITS keyword mapping.
We have had a number of suggestions for maintaining maps of keywords to/from various external standards. From the point of view of the DRMS coder, we could define a pair of routines that do the mappings and simply use them. They can start with a simple rule and expand to do more complex mappings if and when they are implemented.
As a concrete suggestion, the following is offered as a merging of suggestions by Art and Phil. This section will be removed or replaced in a future version of this document.
DRMS name to FITS name:
i. Look in the comment field of the DRMS keyname to see if the first “word” is “[name]” and if so, use name.
ii. If the name is longer than 8 chars and contains a double, triple, or longer sequence of underscores (“__”), replace it with a single hyphen (“-”) and proceed to the next step.
iii. If the name is already a valid FITS name, convert it to upper case and use it.
iv. If the name is too long and the DRMS name is unique when truncated at 8 chars, use the truncated form. If there is still no solution, truncate at 7 chars and add a digit starting at 0 and increasing until a unique name is found. If that fails, truncate at 6 chars and use a 2 digit number, etc.
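The four steps above could be coded roughly as follows. This is a sketch of one reading of the rules, not an existing DRMS routine. The function name, the `used` argument holding FITS names already assigned in the record, and the decision to treat a clash on an otherwise valid name like a clash on a truncated one are all assumptions.

```python
import re

FITS_NAME = re.compile(r"[A-Z0-9_-]{1,8}")


def drms_to_fits(name, comment="", used=()):
    """Map a DRMS keyword name to a FITS name following steps i-iv.

    used: FITS names (upper case) already assigned to other keywords.
    """
    # i. An explicit "[NAME]" as the first word of the comment wins.
    m = re.match(r"\s*\[([^\]\s]+)\]", comment)
    if m:
        return m.group(1)
    # ii. Long names: each run of 2 or more underscores becomes one hyphen.
    if len(name) > 8:
        name = re.sub(r"_{2,}", "-", name)
    # iii. Already a valid FITS name once upper-cased: use it.
    name = name.upper()
    if FITS_NAME.fullmatch(name) and name not in used:
        return name
    # iv. Truncate at 8 chars; on a clash use a shorter stem plus a counter.
    if name[:8] not in used:
        return name[:8]
    for digits in range(1, 8):
        stem = name[:8 - digits]
        for n in range(10 ** digits):
            candidate = f"{stem}{n:0{digits}d}"
            if candidate not in used:
                return candidate
    raise ValueError(f"no unique FITS name found for {name!r}")
```

For example, the DRMS name DATE__OBS comes back as DATE-OBS through step ii, while a long name such as CALIBRATION_VERSION falls through to the truncation in step iv.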
FITS name to DRMS name:
i. If the series already exists, then before the rule below is applied to make the mapping table, the comment will be examined for the external name and, if found, that name will be returned.
ii. If the series does not yet exist, this function can be used to produce a default mapping. Since almost all FITS names are valid DRMS names except for the “-” char, the default mapping need only deal with this case. The conversion will be to convert a “-” to a sequence of “_” long enough to make the name longer than 8 chars, or to 2 underscores if the name is already longer than 8 chars (in case there are two or more hyphens in the name). It is expected that the new mapping generated this way will be added to the keyword comment as the initial word in square brackets when a series is made.
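A sketch of the FITS to DRMS direction, under one reading of the two rules above. The function names are hypothetical, and representing an existing series as a dictionary of DRMS keyword name to comment string is an assumption. For rule i, the DRMS keyword whose comment begins with “[FITSNAME]” is taken to be the name returned.

```python
def default_drms_name(fits_name):
    """Rule ii: expand each hyphen into a run of underscores.

    The first hyphen gets enough underscores to make the whole name longer
    than 8 chars (never fewer than 2); later hyphens, found once the name is
    already longer than 8 chars, get exactly 2.
    """
    name = fits_name
    while "-" in name:
        pad = max(2, 10 - len(name))
        name = name.replace("-", "_" * pad, 1)
    return name


def fits_to_drms(fits_name, series_comments=None):
    """Map a FITS keyword name to a DRMS name.

    series_comments: {DRMS name: comment} for an existing series, else None.
    """
    # Rule i: an existing series may already record the external name.
    if series_comments:
        tag = f"[{fits_name}]"
        for drms_name, comment in series_comments.items():
            if comment.split()[:1] == [tag]:
                return drms_name
    # Rule ii: otherwise fall back to the default mapping.
    return default_drms_name(fits_name)
```

With this default, DATE-OBS becomes DATE__OBS, which the DRMS to FITS rules turn back into DATE-OBS. A FITS name with an underscore directly next to a hyphen would not survive the round trip under this sketch, so such names would need an explicit “[name]” comment.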