RFC 2045: Multipurpose Internet Mail Extensions ...
RFC-Ref

encoding



... User Agent, a program with which human users send and receive mail). Examples of such encodings currently used in the Internet include pure hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in ...
... A Content-Transfer-Encoding header field, which can be used to specify both the encoding ...
... Content-Transfer-Encoding header field, which can be used to specify both the encoding transformation that was applied to the body and the domain of the result. ...
... was applied to the body and the domain of the result. Encoding transformations other than the identity transformation are usually applied to data in order to ...


... This definition is intended to allow various kinds of character encodings, from simple single-table mappings such as US-ASCII to complex table switching methods ...
... character sets and switching techniques make the situation more complex. For example, some communities use the term "character encoding" for what MIME calls a "character set", while ...


... encoding ...


... Content-Transfer-Encoding Header Field ...
... It is necessary, therefore, to define a standard mechanism for encoding such data into a 7bit short line format. Proper labelling of unencoded material in less restrictive formats for direct use over less restrictive transports ...
... less restrictive transports is also desirable. This document specifies that such encodings will be indicated by a new "Content-Transfer-Encoding" header field ...
... specifies that such encodings will be indicated by a new "Content-Transfer-Encoding" header field. This field has not been defined by any previous standard. ...
... Content-Transfer-Encoding Syntax ...
... The Content-Transfer-Encoding field's value is a single token specifying the type of encoding ...
... Content-Transfer-Encoding field's value is a single token specifying the type of encoding, as enumerated below. Formally: ...
... "Content-Transfer-Encoding" ":" mechanism ...
... BASE64 and bAsE64 are all equivalent. An encoding type of 7BIT requires that the body is already in a 7bit mail-ready representation. This is the default value -- that is, "Content-Transfer-Encoding ...
... encoding type of 7BIT requires that the body is already in a 7bit mail-ready representation. This is the default value -- that is, "Content-Transfer-Encoding: 7BIT" is assumed if the Content-Transfer-Encoding header field ...
... default value -- that is, "Content-Transfer-Encoding: 7BIT" is assumed if the Content-Transfer-Encoding header field is not present. ...
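As a minimal sketch of the two rules quoted above (mechanism tokens compare case-insensitively, and a missing header field means 7BIT), the following Python helper normalizes a header value. The name `normalize_cte` is an assumption made for illustration; RFC 2045 does not define it.

```python
def normalize_cte(header_value=None):
    """Return the mechanism token in lowercase.

    A missing Content-Transfer-Encoding header field is treated
    as "7bit", which the text above names as the default.
    """
    if header_value is None:
        return "7bit"
    # Tokens are case-insensitive, so BASE64 and bAsE64 are the same value.
    return header_value.strip().lower()


print(normalize_cte("bAsE64"))
print(normalize_cte())
```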
... Content-Transfer-Encodings Semantics ...
... This single Content-Transfer-Encoding token actually provides two pieces of information. It specifies what sort of encoding ...
... Content-Transfer-Encoding token actually provides two pieces of information. It specifies what sort of encoding transformation the body was subjected to and hence what decoding operation must be used to restore it to its original form, and it ...
... The transformation part of any Content-Transfer-Encodings specifies, either explicitly or implicitly, a single, well-defined decoding ...
... it to the original sequence of octets which was encoded, or shows that it is illegal as an encoded sequence. Content-Transfer-Encodings transformations never depend on any additional external profile information for proper operation. Note that while decoders ...
... must produce a single, well-defined output for a valid encoding no such restrictions exist for encoders: Encoding ...
... encoding no such restrictions exist for encoders: Encoding a given sequence of octets to different, equivalent encoded sequences is perfectly legal. ...
... Three transformations are currently defined: identity, the "quoted-printable" encoding, and the "base64" encoding. The domains ...
... printable" encoding, and the "base64" encoding. The domains are "binary", "8bit ...
... The Content-Transfer-Encoding values "7bit", "8bit", and "binary" all mean that the identity ...
... mean that the identity (i.e. NO) encoding transformation has been performed. As such, they serve simply as indicators of the domain of ...
... domain of the body data, and provide useful information about the sort of encoding that might be needed for transmission in a given transport system. The terms "7bit data", "8bit ...
... The quoted-printable and base64 encodings transform their input from an arbitrary domain into material in the "7bit" range ...
... The proper Content-Transfer-Encoding label must always be used. Labelling unencoded data containing 8bit characters as "7bit" is not ...
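One way to pick an honest label for unencoded data is to test the body against each domain in turn. The sketch below is a simplified illustration, not a complete validator. The function name `narrowest_domain` is hypothetical, and the limits it checks (no NUL octets, CR and LF only as CRLF pairs, lines of at most 998 octets) come from the RFC's definitions of "7bit data" and "8bit data", which the excerpts here quote only in part.

```python
def narrowest_domain(body: bytes) -> str:
    """Return the most restrictive identity label that fits the body."""
    if b"\x00" in body:
        return "binary"
    for line in body.split(b"\r\n"):
        # A CR or LF left over after splitting on CRLF is a bare one.
        if b"\r" in line or b"\n" in line or len(line) > 998:
            return "binary"
    if any(octet > 127 for octet in body):
        return "8bit"
    return "7bit"


print(narrowest_domain(b"plain text\r\n"))
print(narrowest_domain("caf\u00e9\r\n".encode("iso-8859-1")))
```

Data that falls in the "8bit" or "binary" domain needs quoted-printable or base64 before it can be labelled "7bit".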
... Unlike media subtypes, a proliferation of Content-Transfer-Encoding values is both undesirable and unnecessary. However, establishing only a single transformation into the "7bit" domain ...
... possible. There is a tradeoff between the desire for a compact and efficient encoding of largely-binary data and the desire for a somewhat readable encoding of data ...
... efficient encoding of largely-binary data and the desire for a somewhat readable encoding of data that is mostly, but not entirely, 7bit. For this reason, at least two encoding mechanisms are ...
... somewhat readable encoding of data that is mostly, but not entirely, 7bit. For this reason, at least two encoding mechanisms are necessary: a more or less readable encoding (quoted-printable ...
... 7bit. For this reason, at least two encoding mechanisms are necessary: a more or less readable encoding (quoted-printable) and a "dense" or "uniform" encoding ...
... encoding (quoted-printable) and a "dense" or "uniform" encoding (base64). ...
... unencoded binary data in mail bodies. Thus there are no circumstances in which the "binary" Content-Transfer-Encoding is actually valid in Internet mail ...
... NOTE: The five values defined for the Content-Transfer-Encoding field imply nothing about the media type other than the algorithm ...
... New Content-Transfer-Encodings ...
... Implementors may, if necessary, define private Content-Transfer-Encoding values, but must use an x-token, which is a name prefixed by "X-", to indicate its non-standard status, e.g., "Content-Transfer- ...
... token, which is a name prefixed by "X-", to indicate its non-standard status, e.g., "Content-Transfer-Encoding: x-my-new-encoding". Additional standardized Content-Transfer-Encoding ...
... "X-", to indicate its non-standard status, e.g., "Content-Transfer-Encoding: x-my-new-encoding". Additional standardized Content-Transfer-Encoding values must be specified by a standards-track RFC ...
... Encoding: x-my-new-encoding". Additional standardized Content-Transfer-Encoding values must be specified by a standards-track RFC. The requirements ...
... requirements such specifications must meet are given in RFC 2048 (obsoleted by RFC 4288 and RFC 4289). As such, all content-transfer-encoding namespace except that beginning with "X-" is explicitly reserved to the IETF ...
... Unlike media types and subtypes, the creation of new Content-Transfer-Encoding values is STRONGLY discouraged, as it seems likely to hinder interoperability with little potential benefit ...
... If a Content-Transfer-Encoding header field appears as part of a message header ...
... message header, it applies to the entire body of that message. If a Content-Transfer-Encoding header field appears as part of an entity's ...
... entity. If an entity is of type "multipart" the Content-Transfer-Encoding is not permitted to have any value other than "7bit", "8bit" or "binary". Even more ...
... octets rather than bits, so that the mechanisms described here are mechanisms for encoding arbitrary octet streams, not bit streams. If a bit stream ...
... The encoding mechanisms defined here explicitly encode all data in US-ASCII. Thus, for example, suppose an entity ...
... charset=ISO-8859-1 Content-transfer-encoding: base64 ...
... base64 US-ASCII encoding of data that was originally in ISO-8859-1, and will be in that character set ...
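The layering described here (a character set underneath, a US-ASCII transfer encoding on top) can be illustrated with Python's standard `base64` module. This is only a sketch; the sample string is an assumption chosen because it contains one non-ASCII ISO-8859-1 character.

```python
import base64

# Original text, stored as ISO-8859-1 octets.
octets = "caf\u00e9".encode("iso-8859-1")

# What travels in the body: base64 characters, all US-ASCII.
wire = base64.b64encode(octets).decode("ascii")
print(wire)

# The receiver undoes base64 first, then interprets the octets
# in the character set named by the charset parameter.
text = base64.b64decode(wire).decode("iso-8859-1")
print(text)
```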
... Certain Content-Transfer-Encoding values may only be used on certain media types. In particular, it is EXPRESSLY FORBIDDEN to use any ...
... media types. In particular, it is EXPRESSLY FORBIDDEN to use any encodings other than "7bit", "8bit", or "binary" with any composite ...
... composite media types are "multipart" and "message". All encodings that are desired for bodies of type multipart or message must be done at the innermost level, by encoding ...
... "message". All encodings that are desired for bodies of type multipart or message must be done at the innermost level, by encoding the actual body that needs to be encoded. ...
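The restriction on composite media types lends itself to a simple check. The following sketch assumes a hypothetical helper, `composite_encoding_allowed`, that takes a media type string and a mechanism token; it only tests the top-level type and does not parse parameters.

```python
IDENTITY_MECHANISMS = {"7bit", "8bit", "binary"}
COMPOSITE_TYPES = {"multipart", "message"}


def composite_encoding_allowed(media_type: str, mechanism: str = "7bit") -> bool:
    """Return False when a composite type carries a non-identity encoding."""
    top_level = media_type.split("/", 1)[0].strip().lower()
    if top_level in COMPOSITE_TYPES:
        return mechanism.strip().lower() in IDENTITY_MECHANISMS
    # Non-composite types may use any mechanism.
    return True


print(composite_encoding_allowed("multipart/mixed", "base64"))
print(composite_encoding_allowed("image/png", "base64"))
```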
... composite entity has a transfer-encoding value such as "7bit", but one of the enclosed entities has a less restrictive value such as "8bit", then either the ...
... NOTE ON ENCODING RESTRICTIONS: Though the prohibition against using content-transfer-encodings on composite ...
... ON ENCODING RESTRICTIONS: Though the prohibition against using content-transfer-encodings on composite body data may seem overly restrictive, it is necessary to prevent nested encodings ...
... content-transfer-encodings on composite body data may seem overly restrictive, it is necessary to prevent nested encodings, in which data are passed through an encoding algorithm ...
... restrictive, it is necessary to prevent nested encodings, in which data are passed through an encoding algorithm multiple times, and must be decoded multiple times in order to be properly viewed. ...
... algorithm multiple times, and must be decoded multiple times in order to be properly viewed. Nested encodings add considerable complexity to user agents: Aside from the obvious efficiency problems with such multiple encodings ...
... encodings add considerable complexity to user agents: Aside from the obvious efficiency problems with such multiple encodings, they can obscure the basic structure of a message. In particular, they can imply that several decoding operations are necessary simply ...
... to find out what types of bodies a message contains. Banning nested encodings may complicate the job of certain mail gateways, but this seems less of a problem than the effect of nested encodings ...
... encodings may complicate the job of certain mail gateways, but this seems less of a problem than the effect of nested encodings on user agents. ...
... Any entity with an unrecognized Content-Transfer-Encoding must be treated as if it has a Content-Type of "application/octet-stream ...
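A receiving agent might apply this fallback as sketched below. The function name `effective_content_type` is hypothetical, and the sketch assumes an agent that recognizes only the five standard mechanisms, so every "X-" token counts as unrecognized.

```python
RECOGNIZED = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def effective_content_type(content_type: str, mechanism: str = "7bit") -> str:
    """Fall back to application/octet-stream for unknown mechanisms."""
    if mechanism.strip().lower() not in RECOGNIZED:
        # The body cannot be decoded, so the declared type is not trusted.
        return "application/octet-stream"
    return content_type


print(effective_content_type("text/plain", "x-my-new-encoding"))
print(effective_content_type("text/plain", "Base64"))
```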
... ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT-TRANSFER-ENCODING: It may seem that the Content-Transfer-Encoding could be inferred from the characteristics of the media that is to be encoded, ...
... CONTENT-TYPE AND CONTENT-TRANSFER-ENCODING: It may seem that the Content-Transfer-Encoding could be inferred from the characteristics of the media that is to be encoded, or, at the very least, that certain Content-Transfer-Encodings ...
... Content-Transfer-Encoding could be inferred from the characteristics of the media that is to be encoded, or, at the very least, that certain Content-Transfer-Encodings could be mandated for use with specific media types. There are several ...
... reasons why this is not the case. First, given the varying types of transports used for mail, some encodings may be appropriate for some combinations of media types and transports ...
... example, in an 8bit transport, no encoding would be required for text in certain character sets, while such encodings ...
... encoding would be required for text in certain character sets, while such encodings are clearly required for 7bit SMTP.) ...
... Second, certain media types may require different types of transfer encoding under different circumstances. For example, many PostScript bodies might consist entirely of short lines of 7bit data and hence ...
... PostScript bodies might consist entirely of short lines of 7bit data and hence require no encoding at all. Other PostScript bodies (especially those using Level 2 PostScript ...
... PostScript bodies (especially those using Level 2 PostScript's binary encoding mechanism) may only be reasonably represented using a binary transport encoding. ...
... PostScript's binary encoding mechanism) may only be reasonably represented using a binary transport encoding. Finally, since the Content-Type field is intended to be an open-ended ...
... association between media types and encodings effectively couples the specification of an application protocol with a specific lower-level ...
... Translating Encodings ...
... The quoted-printable and base64 encodings are designed so that conversion between them is possible. The only issue that arises in such a conversion is the handling of hard line breaks ...
... such a conversion is the handling of hard line breaks in quoted-printable encoding output. When converting from quoted-printable to base64 ...
... Canonical Encoding Model ...
... affect the treatment of CRLFs, given that the representation of newlines varies greatly from system to system, and the relationship between content-transfer-encodings and character sets. A canonical ...
... character sets. A canonical model for encoding is presented in RFC 2049 for this reason. ...
... Quoted-Printable Content-Transfer-Encoding ...
... The Quoted-Printable encoding is intended to represent data that largely consists of octets that correspond to printable characters in the US-ASCII ...
... In this encoding, octets are to be represented as determined by the following rules: ...
... ASCII EQUAL SIGN) can be represented by "=3D". This rule must be followed except when the following rules allow an alternative encoding. ...
... CRLF sequence, in the Quoted-Printable encoding. Since the canonical representation of media types ...
... and to be displayed to the user) can occur in the quoted-printable encoding of such types. Sequences like "=0D", "=0A", "=0A=0D" and "=0D=0A" will routinely appear in non-text data represented in quoted- ...
... rather than converting to canonical form first, encoding, and then converting back to local representation. In particular, this may apply to plain text material on systems that use newline conventions ...
... implementation optimization is permissible, but only when the combined canonicalization-encoding step is equivalent to performing the three steps separately. ...
... (Soft Line Breaks) The Quoted-Printable encoding REQUIRES that encoded lines be no more than 76 characters long. If longer lines are to be encoded ...
... characters long. If longer lines are to be encoded with the Quoted-Printable encoding, "soft" line breaks must be used. An equal sign as the last character on a ...
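The "=3D" rule and the 76-character limit can both be observed with Python's standard `quopri` module. This is a sketch under stated assumptions: the input is made up for illustration, and the exact point at which the library inserts a soft line break is an implementation detail, so the code checks only the line-length limit and the round trip.

```python
import quopri

# One long line containing an equal sign; sample data for illustration.
original = b"price=" + b"x" * 100

encoded = quopri.encodestring(original)
lines = encoded.split(b"\n")

# The equal sign is escaped, and soft line breaks (a trailing "=")
# keep every encoded line within the 76-character limit.
print(b"=3D" in encoded)
print(max(len(line) for line in lines) <= 76)

# Decoding removes the soft line breaks and restores the input.
print(quopri.decodestring(encoded) == original)
```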
... This can be represented, in the Quoted-Printable encoding, as: ...
... Since the hyphen character ("-") may be represented as itself in the Quoted-Printable encoding, care must be taken, when encapsulating a quoted-printable encoded body inside one or more multipart entities, ...
... NOTE: The quoted-printable encoding represents something of a compromise between readability and reliability in transport ...
... transport. Bodies encoded with the quoted-printable encoding will work reliably over most mail gateways, but may not work perfectly over a few gateways ...
... EBCDIC. A higher level of confidence is offered by the base64 Content-Transfer-Encoding. A way to get reasonably reliable transport through EBCDIC ...
... newline conventions. If such alterations are likely to constitute a corruption of the data, it is probably more sensible to use the base64 encoding rather than the quoted-printable encoding. ...
... base64 encoding rather than the quoted-printable encoding. ...
... NOTE: Several kinds of substrings cannot be generated according to the encoding rules for the quoted-printable content-transfer-encoding ...
... encoding rules for the quoted-printable content-transfer-encoding, and hence are formally illegal if they appear in the output of a quoted-printable encoder ...
... quoted-printable part of a message without itself having been subjected to quoted-printable encoding. A reasonable approach by a robust implementation might be to include the "=" character and the following ...
... found in incoming, encoded data, a robust implementation might nevertheless decode the lines, and might report the erroneous encoding to the user. ...
... Base64 Content-Transfer-Encoding ...
... The Base64 Content-Transfer-Encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable. The encoding ...
... Content-Transfer-Encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable. The encoding and decoding algorithms are simple, but the encoded data are consistently only about 33 percent larger than the ...
... algorithms are simple, but the encoded data are consistently only about 33 percent larger than the unencoded data. This encoding is virtually identical to the one used in Privacy Enhanced Mail (PEM ...
... versions of EBCDIC. Other popular encodings, such as the encoding used by the uuencode utility, Macintosh ...
... versions of EBCDIC. Other popular encodings, such as the encoding used by the uuencode utility, Macintosh binhex 4.0 [RFC-1741 ...
... Macintosh binhex 4.0 [RFC-1741], and the base85 encoding specified as part of Level 2 PostScript, do not share these properties, and thus do not fulfill the portability ...
... share these properties, and thus do not fulfill the portability requirements a binary transport encoding for mail must meet. ...
... The encoding process represents 24-bit groups of input bits ...
... of which is translated into a single digit in the base64 alphabet. When encoding a bit stream via the base64 encoding, the bit stream ...
... When encoding a bit stream via the base64 encoding, the bit stream must be presumed to be ordered with the most-significant-bit ...
... Base64 Alphabet Value Encoding Value Encoding Value Encoding Value Encoding ...
... Value Encoding Value Encoding Value Encoding Value Encoding ...
... Value Encoding Value Encoding Value Encoding Value Encoding 0 A 17 R 34 i 51 z ...
... Encoding Value Encoding Value Encoding Value Encoding 0 A 17 R 34 i 51 z 1 B 18 S 35 j 52 0 ...
... Special processing is performed if fewer than 24 bits are available at the end of the data being encoded. A full encoding quantum is always completed at the end of a body. When fewer than 24 input bits ...
... base64 input is an integral number of octets, only the following cases can arise: (1) the final quantum of encoding input is an integral multiple of 24 bits; here, the final unit of encoded output will be ...
... 24 bits; here, the final unit of encoded output will be an integral multiple of 4 characters with no "=" padding, (2) the final quantum of encoding input is exactly 8 bits; here, the final unit of encoded output will be two characters followed by two "=" ...
... 8 bits; here, the final unit of encoded output will be two characters followed by two "=" padding characters, or (3) the final quantum of encoding input is exactly 16 bits; here, the final unit of encoded output will be three ...
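The three padding cases can be reproduced with Python's standard `base64` module. The inputs are arbitrary sample octets chosen so that the final quantum holds 24, 8, and 16 bits respectively.

```python
import base64

# Case 1: final quantum is a full 24 bits, so there is no padding.
full = base64.b64encode(b"abc")

# Case 2: final quantum is 8 bits: two characters, then "==".
one_octet = base64.b64encode(b"a")

# Case 3: final quantum is 16 bits: three characters, then "=".
two_octets = base64.b64encode(b"ab")

for sample in (full, one_octet, two_octets):
    print(sample.decode("ascii"))
```

In every case the output length is a multiple of four characters, which is what "a full encoding quantum is always completed" amounts to in practice.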
... Care must be taken to use the proper octets for line breaks if base64 encoding is applied directly to text material that has not been converted to canonical form. In particular, text line breaks ...
... line breaks must be converted into CRLF sequences prior to base64 encoding. The important thing to note is that this may be done directly by the encoder ...
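A minimal sketch of doing the conversion directly in the encoder follows. The helper name `base64_encode_text` and its `charset` parameter are assumptions for illustration. It rewrites any local newline convention to CRLF before encoding, and relies on `base64.encodebytes`, which wraps its output at 76 characters and separates output lines with a bare newline (a mail library would emit CRLF on the wire).

```python
import base64
import re


def base64_encode_text(text: str, charset: str = "us-ascii") -> bytes:
    """Canonicalize line breaks to CRLF, then base64-encode the octets."""
    # Match CRLF first so an existing CRLF is not doubled.
    canonical = re.sub(r"\r\n|\r|\n", "\r\n", text)
    return base64.encodebytes(canonical.encode(charset))


encoded = base64_encode_text("first line\nsecond line\n")
print(base64.b64decode(encoded))
```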
... delimiters within base64-encoded bodies within multipart entities because no hyphen characters are used in the base64 encoding. ...


... Using the MIME-Version, Content-Type, and Content-Transfer-Encoding header fields, it is possible to include, in a standardized way, ...


... "Content-Transfer-Encoding" ":" mechanism ...
... encoding ...


