RFC 3492: Punycode: A Bootstring encoding of Unicod...

1. Introduction

   [IDNA] describes an architecture for supporting internationalized
   domain names.  Labels containing non-ASCII characters can be
   represented by ACE labels, which begin with a special ACE prefix and
   contain only ASCII characters.  The remainder of the label after the
   prefix is a Punycode encoding of a Unicode string satisfying certain
   constraints.  For the details of the prefix and constraints, see
   [IDNA] and [NAMEPREP].
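   As a rough illustration of how an ACE label is put together, the
   sketch below uses the "punycode" codec from the Python standard
   library.  The prefix value "xn--" comes from [IDNA], not from this
   section, and the helper names to_ace_label and from_ace_label are
   hypothetical.  The Nameprep step and the other constraints are
   deliberately left out, so this is not a conforming ToASCII or
   ToUnicode implementation.

```python
# Minimal sketch: ACE label = ACE prefix + Punycode encoding of the label.
# Assumption: the prefix "xn--" is taken from [IDNA]; no Nameprep is applied.
ACE_PREFIX = "xn--"


def to_ace_label(label: str) -> str:
    """Return an ASCII-only form of a single label (hypothetical helper)."""
    if label.isascii():
        # Labels that are already ASCII are left unchanged.
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace_label(label: str) -> str:
    """Recover the Unicode form of a single label (hypothetical helper)."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


if __name__ == "__main__":
    unicode_label = "b\u00fccher"  # "bücher"
    ace = to_ace_label(unicode_label)
    print(ace)  # xn--bcher-kva
    print(from_ace_label(ace) == unicode_label)
```

   Real applications would normally use a complete IDNA implementation
   rather than calling the Punycode codec directly.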

   Punycode is an instance of a more general algorithm called
   Bootstring, which allows strings composed from a small set of "basic"
   code points to uniquely represent any string of code points drawn
   from a larger set.  Punycode is Bootstring with particular parameter
   values appropriate for IDNA.

1.1. Features

   Bootstring has been designed to have the following features:

   *  Completeness:  Every extended string (sequence of arbitrary code
      points) can be represented by a basic string (sequence of basic
      code points).  Restrictions on what strings are allowed, and on
      length, can be imposed by higher layers.

   *  Uniqueness:  There is at most one basic string that represents a
      given extended string.

   *  Reversibility:  Any extended string mapped to a basic string can
      be recovered from that basic string.

   *  Efficient encoding:  The ratio of basic string length to extended
      string length is small.  This is important in the context of
      domain names because RFC 1034 [RFC1034] restricts the length of a
      domain label to 63 characters.

   *  Simplicity:  The encoding and decoding algorithms are reasonably
      simple to implement.  The goals of efficiency and simplicity are
      at odds; Bootstring aims at a good balance between them.

   *  Readability:  Basic code points appearing in the extended string
      are represented as themselves in the basic string (although the
      main purpose is to improve efficiency, not readability).
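   Some of the features listed above can be observed informally with
   the "punycode" codec in the Python standard library.  The sketch
   below is only a spot check on a few sample strings, not a proof of
   the properties; the helper name check_features and the choice of
   samples are assumptions made for illustration.

```python
# Informal spot check of Bootstring features using Python's built-in
# "punycode" codec. check_features is a hypothetical helper name.

def check_features(extended: str) -> str:
    """Encode one extended string, check a few properties, return the basic string."""
    basic = extended.encode("punycode").decode("ascii")

    # Completeness: the result consists only of basic (ASCII) code points.
    assert basic.isascii()

    # Reversibility: decoding recovers the original extended string.
    assert basic.encode("ascii").decode("punycode") == extended

    # Readability: the basic code points of the input appear literally,
    # in order, before the last delimiter of the output.
    literal_part = basic.rpartition("-")[0]
    assert literal_part == "".join(c for c in extended if ord(c) < 128)

    return basic


if __name__ == "__main__":
    samples = ["b\u00fccher", "stra\u00dfe", "\u4f8b\u3048"]
    for s in samples:
        b = check_features(s)
        # Efficient encoding: compare the two lengths.
        print(f"{s!r}: {len(s)} code points -> {b!r}: {len(b)} characters")
```

   The length comparison printed at the end gives a feel for the
   efficiency goal: the encoded form has to fit within the 63-character
   label limit mentioned above.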

   Punycode can also support an additional feature that is not used by
   the ToASCII and ToUnicode operations of [IDNA].  When extended
   strings are case-folded prior to encoding, the basic string can use
   mixed case to tell how to convert the folded string into a mixed-case
   string.  See appendix A "Mixed-case annotation".

1.2. Interaction of protocol parts

   Punycode is used by the IDNA protocol [IDNA] for converting domain
   labels into ASCII; it is not designed for any other purpose.  It is
   explicitly not designed for processing arbitrary free text.
