3. Basic Mappings
3.1. Notation
The X.400 protocols are encoded in a structured manner according to ASN.1, whereas RFC 822 is text encoded. To define a detailed mapping, it is necessary to refer to detailed protocol elements in each format. A notation to achieve this is described in this section.
3.1.1. RFC 822
Structured text is defined according to the Extended Backus Naur Form (EBNF) defined in Section 2 of RFC 822 [16]. In the EBNF definitions used in this specification, the syntax rules given in Appendix D of RFC 822 are assumed. When these EBNF tokens are referred to outside an EBNF definition, they are identified by the string "822." prefixed to the token name (e.g., 822.addr-spec). Additional syntax rules, to be used throughout this specification, are defined in this chapter.
The EBNF is used in two ways.
1. To describe components of RFC 822 messages (or of SMTP components). When these new EBNF tokens are referred to outside an EBNF definition, they are identified by the string "EBNF." prefixed to the token name (e.g., EBNF.importance).
2. To describe the structure of IA5 or ASCII information not in an RFC 822 message.
For all new EBNF, tokens will either be self delimiting, or be delimited by self delimiting tokens. Comments and LWSP are not used as delimiters, except for the following cases, where LWSP may be inserted according to RFC 822 rules.
- Around the ":" in all headers
- EBNF.labelled-integer
- EBNF.object-identifier
- EBNF.encoded-info
RFC 822 folding rules are applied to all headers. Comments are never used in these new headers.
This notation is used in a modified form to refer to NOTARY EBNF [28]. For this EBNF, the keyword EBNF is replaced with DSN, for example DSN.final-recipient-field.
3.1.2. ASN.1
An element is referred to with the following syntax, defined in EBNF:
element = service "." definition *( "." definition )
service = "IPMS" / "MTS" / "MTA" / "FTBP"
definition = identifier / context
identifier = ALPHA *< ALPHA or DIGIT or "-" >
context = "[" 1*DIGIT "]"
The EBNF.service keys are shorthand for the following service specifications:
IPMS  IPMSInformationObjects defined in Annex E of X.420 / ISO 10021-7.
MTS   MTSAbstractService defined in Section 9 of X.411 / ISO 10021-4.
MTA   MTAAbstractService defined in Section 13 of X.411 / ISO 10021-4.
FTBP  File Transfer Body Part, as defined in [27].
The first EBNF.identifier identifies a type or value key in the context of the defined service specification. Subsequent EBNF.identifiers identify a value label or type in the context of the first identifier (SET or SEQUENCE). EBNF.context indicates a context tag, and is used where there is no label or type to uniquely identify a component. The special EBNF.identifier keyword "value" is used to denote an element of a sequence. For example, IPMS.Heading.subject defines the subject element of the IPMS heading. The same syntax is also used to refer to element values. For example, MTS.EncodedInformationTypes.[0].g3Fax refers to a value of MTS.EncodedInformationTypes.[0].
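As an illustration of the element notation, the following is a minimal sketch of a parser for such references. The function name parse_element is hypothetical, and the sketch assumes that FTBP is accepted as a service key because it appears in the list of service specifications above. It only checks the syntax; it does not verify that the named definitions exist in the X.400 modules.

```python
import re

# One EBNF.definition: an identifier (ALPHA followed by ALPHA / DIGIT / "-")
# or a context tag such as "[0]".
_DEFINITION = r"(?:[A-Za-z][A-Za-z0-9-]*|\[[0-9]+\])"

# element = service "." definition *( "." definition )
# Assumption: FTBP is accepted as a service key alongside IPMS, MTS and MTA.
_ELEMENT = re.compile(r"(IPMS|MTS|MTA|FTBP)((?:\." + _DEFINITION + r")+)\Z")


def parse_element(ref):
    """Split an element reference into (service, [definition, ...])."""
    match = _ELEMENT.match(ref)
    if match is None:
        raise ValueError("not a valid element reference: %r" % ref)
    service = match.group(1)
    # Drop the leading "." and split; definitions never contain a ".".
    definitions = match.group(2)[1:].split(".")
    return service, definitions
```

For example, parse_element("MTS.EncodedInformationTypes.[0].g3Fax") yields the service key "MTS" and the three definitions "EncodedInformationTypes", "[0]" and "g3Fax".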
3.2. ASCII and IA5
A gateway will interpret all IA5 as ASCII. Thus, mapping between these forms is conceptual.
3.3. Standard Types
There is a need to convert between ASCII text and some of the types defined in ASN.1 [14]. For each case, an EBNF syntax definition is given, for use in all of this specification, which leads to a mapping between ASN.1, and an EBNF construct. All EBNF syntax definitions of ASN.1 types are in lower case, whereas ASN.1 types are referred to with the first letter in upper case. Except as noted, all mappings are symmetrical.
3.3.1. Boolean
Boolean is encoded as:
boolean = "TRUE" / "FALSE"
3.3.2. NumericString
NumericString is encoded as:
numericstring = *(DIGIT / " ")
3.3.3. PrintableString
PrintableString is a restricted IA5String defined as:
printablestring = *( ps-char )
ps-restricted-char = 1DIGIT / 1ALPHA / " " / "'" / "+"
/ "," / "-" / "." / "/" / ":" / "=" / "?"
ps-delim = "(" / ")"
ps-char = ps-delim / ps-restricted-char
This can be used to represent real printable strings in EBNF.
3.3.4. T.61String
In cases where T.61 strings are only used for conveying human interpreted information, the aim of a mapping is to render the characters appropriately in the remote character set, rather than to maximise reversibility. For these cases, there are two options, both of which are conformant to this specification:
1. The mappings to IA5 defined in ITU-T Recommendation X.408 (1988) may be used [13]. These will then be encoded in ASCII. This is the approach mandated in RFC 1327.
2. This mapping may be used if the characters are not contained within the ASCII repertoire, but are all in an IANA-registered character set. Use the encoding defined in RFC 1522 [9] to generate appropriate encoded-words. If this mapping is used, the character set ISO-8859-1 shall be used if all of the characters needed are available in this repertoire. In other cases, the character set TELETEX shall be used. The details of this character set are defined in Appendix C of RFC 2157.
There is also a need to represent Teletex Strings in ASCII, for some aspects of OR Address. For these, the following encoding is used:
teletex-string = *( ps-char / t61-encoded )
t61-encoded = "{" 1* t61-encoded-char "}"
t61-encoded-char = 3DIGIT
Characters in EBNF.ps-char are mapped simply. Other octets, including control characters, are mapped using a quoting mechanism similar to the printable string mechanism. Each octet is represented as 3 decimal digits. For example, the Yen character (hex A5) is represented as {165}. The three character string a, yen character, b would be represented as "a{165}b".
The use of escape sequences follows that set down for ASN.1 in ISO 8825-1, with the additional specification that the default G1 page is ISO Latin 1. The page settings may be changed by escape sequences. Changes of the settings hold within a pair of curly brackets ({}), and the settings revert to the default after the right bracket (}) (i.e., they do not carry forward to subsequent T.61 encoding).
There are a number of places where a string may have a Teletex and/or Printable String representation. The following EBNF is used to represent this.
teletex-and-or-ps = [ printablestring ] [ "*" teletex-string ]
The natural mapping is restricted to EBNF.ps-char, in order to make the full BNF easier to parse. An example is: "yen*{165}"
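The octet quoting used by EBNF.teletex-string can be sketched as follows. This is a minimal illustration, not a complete T.61 implementation: the function names are hypothetical, the input is treated as a raw octet string, and escape sequences are passed through as ordinary quoted octets without being interpreted. The sketch also assumes that a run of adjacent octets needing quoting is grouped into one pair of curly brackets, which the grammar permits.

```python
import string

# Characters of EBNF.ps-char (ps-restricted-char plus the two ps-delim).
PS_CHARS = set(string.ascii_letters + string.digits + " '+,-./:=?()")


def _quote(run):
    """Render a run of octets as one t61-encoded group, 3 digits each."""
    return "{" + "".join("%03d" % octet for octet in run) + "}"


def encode_teletex(octets):
    """Map T.61 octets (bytes) onto an EBNF.teletex-string."""
    out, run = [], []
    for octet in octets:
        ch = chr(octet)
        if ch in PS_CHARS:
            if run:                 # close any pending quoted group
                out.append(_quote(run))
                run = []
            out.append(ch)
        else:
            run.append(octet)       # needs quoting
    if run:
        out.append(_quote(run))
    return "".join(out)


def decode_teletex(text):
    """Map an EBNF.teletex-string back onto octets (bytes)."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "{":
            end = text.find("}", i)
            body = text[i + 1:end] if end != -1 else ""
            if not body or len(body) % 3 or not (body.isascii() and body.isdigit()):
                raise ValueError("malformed t61-encoded group")
            for j in range(0, len(body), 3):
                value = int(body[j:j + 3])
                if value > 255:
                    raise ValueError("octet value out of range")
                out.append(value)
            i = end + 1
        elif text[i] in PS_CHARS:
            out.append(ord(text[i]))
            i += 1
        else:
            raise ValueError("character outside ps-char: %r" % text[i])
    return bytes(out)
```

With these definitions, encode_teletex(b"a\xa5b") gives "a{165}b", matching the Yen example in the text, and decode_teletex reverses it.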
3.3.5. UTCTime
Both UTCTime and the RFC 822 822.date-time syntax contain: Year, Month, Day of Month, hour, minute, second (optional), and Timezone (technically a time differential in UTCTime). 822.date-time also contains an optional day of the week, but this is redundant. With the exception of Year, a symmetrical mapping can be made between these constructs.
Note: In practice, a gateway will need to parse various illegal variants on 822.date-time. In cases where 822.date-time cannot be parsed, it is recommended that the derived UTCTime is set to the value at the time of translation. Such errors may be noted in an RFC 822 comment, to aid detection and correction.
When mapping to X.400, the UTCTime format which specifies the timezone offset shall be used. When mapping to RFC 822, the 822.date-time format shall include a numeric timezone offset (e.g., -0500). When mapping time values, the timezone shall be preserved as specified. The date shall not be normalised to any other timezone.
RFC 822, as modified by RFC 1123, requires use of a four digit year. Note that the original RFC 822 uses a two digit year, which is no longer legal. UTCTime uses a two digit year. To map a year from RFC 822 to X.400, simply use the last two digits. To map a year from X.400 to RFC 822, assume that the two digit year refers to a year in the 100 year epoch 1980-2079.
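The year rules and the X.400 to RFC 822 direction of the time mapping can be sketched as follows. The function names are hypothetical, and the sketch assumes a UTCTime value of the form YYMMDDhhmm[ss] followed by "Z" or a +hhmm / -hhmm differential. The timezone is carried over unchanged, as the text requires, and the day of the week is generated by the library.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def year_to_utctime(year):
    """RFC 822 to X.400: keep only the last two digits of the year."""
    return "%02d" % (year % 100)


def year_from_utctime(yy):
    """X.400 to RFC 822: place the two digit year in the epoch 1980-2079."""
    return 1900 + yy if yy >= 80 else 2000 + yy


# Assumed UTCTime layout: YYMMDDhhmm, optional ss, then Z or a differential.
_UTCTIME = re.compile(
    r"([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})?"
    r"(Z|[+-][0-9]{4})\Z"
)


def utctime_to_822(value):
    """Render a UTCTime string as an 822.date-time with a numeric zone."""
    match = _UTCTIME.match(value)
    if match is None:
        raise ValueError("unparseable UTCTime: %r" % value)
    yy, month, day, hour, minute, second, zone = match.groups()
    if zone == "Z":
        offset = timedelta(0)
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5]))
    moment = datetime(
        year_from_utctime(int(yy)), int(month), int(day),
        int(hour), int(minute), int(second or 0),
        tzinfo=timezone(offset),      # preserve the zone, do not normalise
    )
    return format_datetime(moment)
```

A gateway would still need a tolerant parser for the reverse direction, since the text notes that illegal 822.date-time variants occur in practice; that part is not shown here.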
3.3.6. Integer
A basic ASN.1 Integer will be mapped onto EBNF.numericstring. In many cases ASN.1 will enumerate Integer values or use ENUMERATED. An EBNF encoding labelled-integer is provided. When mapping from EBNF to ASN.1, only the integer value is mapped, and the associated text is discarded. When mapping from ASN.1 to EBNF, a text label may be added. It is recommended that this is done wherever possible and that clear text labels are chosen.
A second encoding labelled-integer-2 is provided. This is used in DSNs, where the parsing rules will treat the text as a comment. This definition was not present in RFC 1327.
labelled-integer ::= [ key-string ] "(" numericstring ")"
labelled-integer-2 ::= [ numericstring ] "(" key-string ")"
key-string = *key-char
key-char = <a-z, A-Z, 0-9, and "-">
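A minimal sketch of the labelled-integer mapping follows. The function names and the label "urgent" used in the usage note are hypothetical. The sketch assumes the numericstring holds at least one digit and no embedded spaces, and it tolerates LWSP around the brackets. When parsing, the text label is discarded and only the integer is kept.

```python
import re

# labelled-integer ::= [ key-string ] "(" numericstring ")"
# Assumption: the numericstring is a non-empty run of digits.
_LABELLED_INTEGER = re.compile(
    r"\s*([A-Za-z0-9-]*)\s*\(\s*([0-9]+)\s*\)\s*\Z"
)


def labelled_integer_to_int(text):
    """EBNF to ASN.1: keep the integer value, discard the label."""
    match = _LABELLED_INTEGER.match(text)
    if match is None:
        raise ValueError("not a labelled-integer: %r" % text)
    return int(match.group(2))


def int_to_labelled_integer(value, label=""):
    """ASN.1 to EBNF: emit the optional label followed by "(value)"."""
    return "%s(%d)" % (label, value)


def int_to_labelled_integer_2(value, label):
    """DSN form: the value first, with the label in comment position."""
    return "%d(%s)" % (value, label)
```

For example, labelled_integer_to_int("urgent(2)") and labelled_integer_to_int("(2)") both give 2, while int_to_labelled_integer(2, "urgent") gives "urgent(2)".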
3.3.7. Object Identifier
Object identifiers are represented in a form similar to that given in ASN.1. The order is the same as for ASN.1 (big-endian). The numbers are mandatory, and used when mapping from the ASCII to ASN.1. The key-strings are optional. It is recommended that as many strings as possible are generated when mapping from ASN.1 to ASCII, to facilitate user recognition.
object-identifier ::= oid-comp object-identifier | oid-comp
oid-comp ::= [ key-string ] "(" numericstring ")"
An example representation of an object identifier is:
joint-iso-ccitt(2) mhs (6) ipms (1) ep (11) ia5-text (0)
or
(2) (6) (1)(11)(0)
Because of the use of brackets and the conflict with the RFC 822 comment convention, MIXER is defined so that the EBNF.object-identifier definition is not used in structured fields.
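The ASCII to ASN.1 direction of the object identifier mapping can be sketched as below. The function name is hypothetical. The sketch keeps only the mandatory numbers and ignores any key-strings, and it assumes each numericstring is a non-empty run of digits with optional LWSP around it.

```python
import re

# oid-comp ::= [ key-string ] "(" numericstring ")"
_OID_COMP = re.compile(r"\s*[A-Za-z0-9-]*\s*\(\s*([0-9]+)\s*\)")


def parse_object_identifier(text):
    """Return the arcs of an EBNF.object-identifier as a tuple of ints."""
    text = text.strip()
    arcs = []
    pos = 0
    while pos < len(text):
        match = _OID_COMP.match(text, pos)
        if match is None:
            raise ValueError("malformed oid-comp at offset %d" % pos)
        arcs.append(int(match.group(1)))   # the key-string is ignored
        pos = match.end()
    if not arcs:
        raise ValueError("an object identifier needs at least one oid-comp")
    return tuple(arcs)
```

Both example representations in the text parse to the same arcs, (2, 6, 1, 11, 0), which shows that the key-strings carry no information for the mapping.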
3.4. Encoding ASCII in Printable String
Some information in RFC 822 is represented in ASCII, and needs to be mapped into X.400 elements encoded as printable string. For this reason, a mechanism to represent ASCII encoded as PrintableString is needed.
A structured subset of EBNF.printablestring is now defined. This shall be used to encode ASCII in the PrintableString character set.
ps-encoded = *( ps-restricted-char / ps-encoded-char )
ps-encoded-char = "(a)" ; (@)
/ "(p)" ; (%)
/ "(b)" ; (!)
/ "(q)" ; (")
/ "(u)" ; (_)
/ "(l)" ; "("
/ "(r)" ; ")"
/ "(" 3DIGIT ")"
The 822.3DIGIT in EBNF.ps-encoded-char shall have range 0-127, and is interpreted in decimal as the corresponding ASCII character. Special encodings are given for: at sign (@), percent (%), exclamation mark/bang (!), double quote ("), underscore (_), left bracket ((), and right bracket ()). These characters, with the exception of round brackets, are not included in PrintableString, but are common in RFC 822 addresses. The abbreviations will ease specification of RFC 822 addresses from an X.400 system. These special encodings shall be interpreted in a case insensitive manner, but always generated in lower case.
A reversible mapping between PrintableString and ASCII can now be defined. The reversibility means that some values of printable string (containing round braces) cannot be generated from ASCII. Therefore, this mapping shall only be used in cases where the printable strings have been derived from ASCII (and will therefore have a restricted domain). For example, in this specification, it is only applied to a Domain Defined Attribute which will have been generated by use of this specification and a value such as "(" would not be possible.
To encode ASCII as PrintableString, the EBNF.ps-encoded syntax is used, with all EBNF.ps-restricted-char mapped directly. All other 822.CHAR are encoded as EBNF.ps-encoded-char.
To encode PrintableString as ASCII, parse PrintableString as EBNF.ps-encoded, and then reverse the previous mapping. If the PrintableString cannot be parsed, then the mapping is being applied to an inappropriate value, and an error shall be given to the procedure doing the mapping. In some cases, it may be preferable to pass the printable string through unaltered.
Some examples are now given. Note the arrows which indicate asymmetrical mappings:
PrintableString           ASCII
'a demo.'         <->     'a demo.'
foo(a)bar         <->     foo@bar
(q)(u)(p)(q)      <->     "_%"
(a)               <->     @
(A)               ->      @
(l)a(r)           <->     (a)
(126)             <->     ~
(                 ->      (
(l)               <->     (
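The two directions of this mapping can be sketched as follows. The function names are hypothetical. The sketch raises an error when a PrintableString cannot be parsed as EBNF.ps-encoded; a caller that prefers to pass such a value through unaltered, as the text allows, would catch that error.

```python
import re
import string

# EBNF.ps-restricted-char: mapped directly in both directions.
PS_RESTRICTED = set(string.ascii_letters + string.digits + " '+,-./:=?")

# Special encodings, always generated in lower case.
SPECIAL = {"@": "(a)", "%": "(p)", "!": "(b)", '"': "(q)",
           "_": "(u)", "(": "(l)", ")": "(r)"}
REVERSE = {code[1]: ch for ch, code in SPECIAL.items()}

# Either one ps-encoded-char (letters accepted in any case) or any one char.
_TOKEN = re.compile(r"\(([abpqulrABPQULR]|[0-9]{3})\)|(.)", re.DOTALL)


def ascii_to_ps(text):
    """Encode ASCII text as PrintableString using EBNF.ps-encoded."""
    out = []
    for ch in text:
        if ch in PS_RESTRICTED:
            out.append(ch)
        elif ch in SPECIAL:
            out.append(SPECIAL[ch])
        elif ord(ch) < 128:
            out.append("(%03d)" % ord(ch))   # decimal ASCII code
        else:
            raise ValueError("not an ASCII character: %r" % ch)
    return "".join(out)


def ps_to_ascii(value):
    """Decode a PrintableString that was produced by this mapping."""
    out = []
    for match in _TOKEN.finditer(value):
        code, literal = match.groups()
        if code is None:
            if literal not in PS_RESTRICTED:
                raise ValueError("cannot parse as ps-encoded: %r" % value)
            out.append(literal)
        elif code.isdigit():
            number = int(code)
            if number > 127:
                raise ValueError("code out of range 0-127: %s" % code)
            out.append(chr(number))
        else:
            out.append(REVERSE[code.lower()])
    return "".join(out)
```

The examples in the table can be checked against this sketch: ascii_to_ps("foo@bar") gives "foo(a)bar", and ps_to_ascii("(A)") gives "@" even though the encoder never generates the upper case form.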
3.5. RFC 1522
RFC 1522 defines a mechanism for encoding other character set information into elements of RFC 822 Headers. A gateway may ignore this encoding and treat the elements as ASCII.
A preferred approach is for the gateway to interpret the RFC 1522 encoding. This will not always be straightforward, because:
1. RFC 1522 permits an openly extensible character set choice, which may be broader than T.61.
2. It is not always possible to map all characters into the equivalent X.400 field.
RFC 1522 is only applied to fields which are "for information only". A gateway which interprets header elements according to RFC 1522 may apply reasonable heuristics to minimise information loss.
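As an illustration of a gateway interpreting the encoding rather than ignoring it, the sketch below decodes encoded-words with the Python standard library and then checks whether the result fits ISO-8859-1. The function names are hypothetical, and the repertoire check is only one possible heuristic, chosen here as an assumption; the specification does not prescribe a particular one.

```python
from email.header import decode_header, make_header


def decode_1522(value):
    """Decode any encoded-words in a header element into a text string."""
    return str(make_header(decode_header(value)))


def fits_latin1(text):
    """Heuristic: can the decoded text be carried as ISO-8859-1?"""
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True
```

A gateway might, for example, use the decoded form when fits_latin1 returns True and otherwise fall back to treating the element as ASCII, which the text explicitly permits.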
