RFC 3492: Punycode: A Bootstring encoding of Unicod...
RFC-Ref

unicode



... prefix is a Punycode encoding of a Unicode string satisfying certain constraints. For the details of the prefix ...


... character set. As in the Unicode Standard [UNICODE], Unicode code points are denoted by "U+" followed by four to six hexadecimal digits, while a range ...
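
The "U+" notation quoted above can be sketched in C. This is an illustration only, not code from the RFC: the helper names format_code_point() and parse_code_point() are hypothetical, and the upper limit of U+10FFFF is the Unicode code point range rather than anything stated in the excerpt.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write cp as "U+" followed by at least four
   uppercase hexadecimal digits (six at most, since cp <= 0x10FFFF).
   Returns the number of characters written, or -1 if cp is out of
   range or the buffer is too small. */
int format_code_point(unsigned long cp, char *buf, size_t cap)
{
    int n;
    assert(buf != NULL);
    if (cp > 0x10FFFFUL) return -1;
    n = snprintf(buf, cap, "U+%04lX", cp);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Hypothetical helper: parse "U+" followed by four to six hexadecimal
   digits. Returns 0 and stores the value in *cp on success, -1 otherwise. */
int parse_code_point(const char *s, unsigned long *cp)
{
    size_t i, len;
    assert(s != NULL && cp != NULL);
    len = strlen(s);
    if (len < 6 || len > 8) return -1;            /* "U+" plus 4..6 digits */
    if (s[0] != 'U' || s[1] != '+') return -1;
    for (i = 2; i < len; ++i)
        if (!isxdigit((unsigned char)s[i])) return -1;
    *cp = strtoul(s + 2, NULL, 16);
    return (*cp > 0x10FFFFUL) ? -1 : 0;
}
```

For instance, the code point 0xFC formats as "U+00FC", and "U+FC" is rejected by the parser because it has fewer than four digits.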


... Although the only restriction Punycode imposes on the input integers is that they be nonnegative, these parameters are especially designed to work well with Unicode [UNICODE] code points, which are integers ...
... use by the UTF-16 encoding of Unicode). The basic code points are the ASCII ...
... Using hyphen-minus as the delimiter implies that the encoded string can end with a hyphen-minus only if the Unicode string consists entirely of basic code points, but IDNA ...
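
The remark about the delimiter can be illustrated with a sketch of only the first stage of encoding, in which the basic (ASCII) code points are copied in order and a hyphen-minus is appended if at least one was copied. The names is_basic() and copy_basic_code_points() are hypothetical, and the digits that encode the non-basic code points are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic code points are the ASCII code points, 0..0x7F. */
int is_basic(uint32_t cp) { return cp < 0x80; }

/* Hypothetical helper covering only the first stage of encoding:
   copy the basic code points in order, then append '-' if at least
   one was copied. Returns the number of characters written (not
   counting the terminating NUL), or (size_t)-1 if out is too small. */
size_t copy_basic_code_points(const uint32_t *in, size_t n,
                              char *out, size_t cap)
{
    size_t j, len = 0;
    assert(in != NULL && out != NULL);
    if (cap == 0) return (size_t)-1;
    for (j = 0; j < n; ++j) {
        if (is_basic(in[j])) {
            /* keep room for this character, the delimiter, and NUL */
            if (len + 2 >= cap) return (size_t)-1;
            out[len++] = (char)in[j];
        }
    }
    if (len > 0) out[len++] = '-';
    out[len] = '\0';
    return len;
}
```

When the input consists entirely of basic code points, nothing follows the delimiter, so "abc" yields "abc-". When non-basic code points are present, further digits come after the delimiter, which is why the encoded string then does not end with a hyphen-minus.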


... DNS to be controlled by a single authority. If a Unicode string intended for use as a domain label could map to multiple ACE ...
... service requests intended for another. Therefore Punycode is designed so that each Unicode string has a unique encoding. ...
... encoding. However, there can still be multiple Unicode representations of the "same" text, for various definitions of "same". This problem is addressed to some extent by the Unicode standard under the topic of canonicalization, and this work is leveraged for domain names ...
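
As a small illustration of "multiple Unicode representations", the letter é can be written as the single code point U+00E9 or as U+0065 followed by the combining acute accent U+0301. The sketch below only compares the two sequences code point by code point; the array and function names are hypothetical, and no normalization is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "é" as one precomposed code point, and as "e" plus U+0301
   (combining acute accent). Illustrative data only. */
const uint32_t precomposed[] = { 0x00E9 };
const uint32_t decomposed[]  = { 0x0065, 0x0301 };

/* Hypothetical helper: return 1 if the two code point sequences are
   identical, 0 otherwise. This is a plain comparison; it does not
   apply any Unicode normalization. */
int same_code_points(const uint32_t *a, size_t na,
                     const uint32_t *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    return na == nb && memcmp(a, b, na * sizeof *a) == 0;
}
```

Because the two sequences differ, and each Unicode string has a unique encoding, they would produce different Punycode strings unless the text is canonicalized before encoding.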


... The Unicode Consortium, "The Unicode Standard", http://www.unicode.org/unicode/standard/standard.html . ...


... char output[] ); /* punycode_encode() converts Unicode to Punycode. The input */ /* is represented as an array of Unicode code points (not code */ /* units; surrogate pairs are not allowed), and the output */ ...
... /* holds input_length boolean values, where nonzero suggests that */ /* the corresponding Unicode character be forced to uppercase */ /* after being decoded (if possible), and zero suggests that */ /* it be forced to lowercase (if possible). ASCII ...
... unsigned char case_flags[] ); /* punycode_decode() converts Punycode to Unicode. The input is */ /* represented as an array of ASCII code points ...
... ASCII code points, and the output */ /* will be represented as an array of Unicode code points. The */ /* input_length is the number of code points ...
... /* least output_length values, or it can be a null pointer if the */ /* case information is not needed. A nonzero flag suggests that */ /* the corresponding Unicode character be forced to uppercase */ /* by the caller (if possible), while zero suggests that it be */ ...
... enum { unicode_max_length = 256, ace_max_length = 256 }; ...
... int r; unsigned int input_length, output_length, j; unsigned char case_flags[unicode_max_length]; if (argc != 2) usage(argv); ...
... if (argv[1][1] == 'e') { punycode_uint input[unicode_max_length]; unsigned long codept; char output[ace_max_length+1], uplus[3]; ...
... } if (input_length == unicode_max_length) fail(too_big); if (uplus[0] == 'u') case_flags[input_length] = 0; ...
... if (argv[1][1] == 'd') { char input[ace_max_length+2], *p, *pp; punycode_uint output[unicode_max_length]; /* Read the Punycode input string and convert to ASCII ...
... /* Decode: */ output_length = unicode_max_length; status = punycode_decode(input_length, input, &output_length, output, case_flags); ...
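
The excerpts above show only fragments of the RFC's sample code. For orientation, here is a compact sketch of the encoding direction, using the Punycode parameter values (base 36, tmin 1, tmax 26, skew 38, damp 700, initial bias 72, initial n 128). It is not the RFC's punycode_encode(): the names puny_encode_sketch() and puny_selfcheck(), the simplified signature, and the error codes are assumptions made for this illustration, and case flags are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700,
       INITIAL_BIAS = 72, INITIAL_N = 128 };

/* Map a digit value 0..35 to 'a'..'z' then '0'..'9'. */
static char digit_char(uint32_t d)
{
    return (char)(d < 26 ? 'a' + d : '0' + (d - 26));
}

/* Bias adaptation applied after each encoded delta. */
static uint32_t adapt(uint32_t delta, uint32_t numpoints, int firsttime)
{
    uint32_t k = 0;
    delta = firsttime ? delta / DAMP : delta / 2;
    delta += delta / numpoints;
    while (delta > ((BASE - TMIN) * TMAX) / 2) {
        delta /= BASE - TMIN;
        k += BASE;
    }
    return k + (BASE - TMIN + 1) * delta / (delta + SKEW);
}

/* Hypothetical sketch: encode n_in code points into a NUL-terminated
   string. Returns 0 on success, -1 if out is too small, -2 on overflow. */
int puny_encode_sketch(const uint32_t *in, size_t n_in, char *out, size_t cap)
{
    uint32_t n = INITIAL_N, delta = 0, bias = INITIAL_BIAS, h, b, m, q, k, t;
    size_t j, len = 0;

    assert(in != NULL && out != NULL);
    if (cap == 0) return -1;

    /* Copy the basic code points, then the delimiter if any were copied. */
    for (j = 0; j < n_in; ++j)
        if (in[j] < 0x80) {
            if (len + 2 >= cap) return -1;
            out[len++] = (char)in[j];
        }
    h = b = (uint32_t)len;
    if (b > 0) out[len++] = '-';

    /* Handle the remaining code points in increasing order of value. */
    while (h < n_in) {
        for (m = UINT32_MAX, j = 0; j < n_in; ++j)
            if (in[j] >= n && in[j] < m) m = in[j];
        if (m - n > (UINT32_MAX - delta) / (h + 1)) return -2;
        delta += (m - n) * (h + 1);
        n = m;
        for (j = 0; j < n_in; ++j) {
            if (in[j] < n && ++delta == 0) return -2;
            if (in[j] == n) {
                /* Emit delta as a generalized variable-length integer. */
                for (q = delta, k = BASE; ; k += BASE) {
                    if (len + 1 >= cap) return -1;
                    t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
                    if (q < t) break;
                    out[len++] = digit_char(t + (q - t) % (BASE - t));
                    q = (q - t) / (BASE - t);
                }
                out[len++] = digit_char(q);
                bias = adapt(delta, h + 1, h == b);
                delta = 0;
                ++h;
            }
        }
        ++delta;
        ++n;
    }
    out[len] = '\0';
    return 0;
}

/* Usage check: "bücher" (b, U+00FC, c, h, e, r) encodes to "bcher-kva". */
int puny_selfcheck(void)
{
    const uint32_t in[] = { 'b', 0xFC, 'c', 'h', 'e', 'r' };
    char out[32];
    return puny_encode_sketch(in, 6, out, sizeof out) == 0
        && strcmp(out, "bcher-kva") == 0;
}
```

The sketch takes code points rather than code units, in line with the comment that surrogate pairs are not allowed, but it does not itself reject surrogate values; a caller would need to check for them separately.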


