HTML
... create hypertext documents that are platform independent. Initially,
the application of HTML on the World Wide Web was seriously
restricted by its reliance on the ISO ...
... which is appropriate only for Western European languages. Despite
this restriction, HTML has been widely used with other languages,
using other coded character sets ...
... address the issue of the
internationalization of HTML by extending the specification of HTML
and giving additional recommendations for proper internationalization ...
... on multilingualism on the WWW [NICOL]. A foremost consideration is
to make sure that HTML remains a valid application of SGML, while
...
... SGML document character set to
be used for HTML, the proper treatment of the charset parameter
associated with the "text/html" content type ...
...
... HTML has been in use by the World-Wide Web (WWW) global information
initiative since 1990. This specification extends the capabilities
of HTML 2.0 (RFC 1866), primarily by removing the restriction to the
...
... ISO-8859].
HTML is an application of ISO Standard 8879:1986, Information
Processing Text and Office Systems -- Standard Generalized Markup
Language ...
... Type Definition (DTD)
is a formal definition of the HTML syntax in terms of SGML. This
specification amends the DTD ...
... SGML. This
specification amends the DTD of HTML 2.0 in order to make it
applicable to documents encompassing a character repertoire much
...
... conformant.
Both formal and actual development of HTML are advancing very fast.
The features described in this document are designed so that they can
(and should) be added to other forms of HTML ...
... HTML are advancing very fast.
The features described in this document are designed so that they can
(and should) be added to other forms of HTML besides that described
in RFC 1866. Where indicated, attributes introduced here should be
...
... This specification changes slightly the conformance requirements of
HTML documents and HTML user agents.
...
...
All HTML 2.0 conforming documents remain conforming with this
specification. However, the extensions introduced here make valid
...
... specification. However, the extensions introduced here make valid
certain documents that would not be HTML 2.0 conforming, in
particular those containing characters or character references
outside of the repertoire of ISO 8859-1 ...
... user agents MUST correctly interpret
the charset parameter accompanying an HTML document received from
the network.
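As a rough illustration of the charset requirement, the following Python sketch reads the charset parameter from a Content-Type value using the standard library's email.message.Message. The helper name charset_of and the sample header values are assumptions made for this example; they do not come from the specification.

```python
from email.message import Message


def charset_of(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset()


# Hypothetical header values, for illustration only.
print(charset_of("text/html; charset=ISO-2022-JP"))
print(charset_of("text/html"))
```

What a user agent does when the parameter is absent is outside the scope of this sketch, which is why the helper simply returns None in that case.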
...
...
This overview explains a reference processing model used for HTML,
and in particular the SGML concept of a document character set ...
... SGML document, and it should be carefully
distinguished from the document character set of the abstract HTML
document. SGML views the characters as a single set (called a
"character repertoire ...
... question of the external character encoding. This is deferred to
mechanisms external to HTML, such as MIME as used by the HTTP
...
... 6.
Similarly, if HTML documents are transferred by electronic mail, the
external character encoding ...
... No mechanisms are currently standardized for indicating the external
character encoding of HTML documents transferred by FTP or accessed
in distributed file systems ...
... file systems.
In case any other ways of transferring and storing HTML documents
are defined or become popular, it is advised that similar provisions
be made to clearly identify the character encoding ...
... character set
specified in Section 2.2 before processing specific to SGML/HTML.
The reference processing model can be depicted as follows:
...
... entity manager, the parser, and the application, as far as character
semantics are concerned, are using the HTML document character set
only.
...
... character set implies a change in the
SGML declaration specified in the HTML 2.0 specification (section 9.5
of [RFC1866]). The change amounts to removing ...
... create
non-conformance of any expression, construct or document that is
conforming to HTML 2.0. It does make conforming certain constructs
that are not admissible in HTML 2.0. One consequence is that data
...
... conforming to HTML 2.0. It does make conforming certain constructs
that are not admissible in HTML 2.0. One consequence is that data
characters outside the repertoire of ISO-8859-1, but within that of
...
...
NOTE -- the above SGML declaration, like that of HTML 2.0,
specifies the character numbers 128 to 159 (80 to 9F hex) as
UNUSED. This means that numeric character references within that
...
... UNUSED. This means that numeric character references within that
range (e.g. &#146;) are illegal in HTML. Neither ISO 8859-1 nor
ISO 10646 ...
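The rule in the NOTE can be checked mechanically. The Python sketch below scans a string for decimal numeric character references and reports those whose number falls in the UNUSED range 128 to 159. The function name illegal_numeric_refs and the sample input are assumptions for illustration, and hexadecimal reference forms are deliberately not handled.

```python
import re

# Character numbers 128..159 (80 to 9F hex) are declared UNUSED.
UNUSED = range(128, 160)

# Decimal numeric character references such as &#146; (hex forms are not handled).
NUMERIC_REF = re.compile(r"&#([0-9]+);")


def illegal_numeric_refs(text):
    """Return the decimal references in text whose number lies in the UNUSED range."""
    return [
        match.group(0)
        for match in NUMERIC_REF.finditer(text)
        if int(match.group(1)) in UNUSED
    ]


# Hypothetical sample: 146 and 128 are in the UNUSED range, 233 is not.
print(illegal_numeric_refs("a &#146; b &#233; c &#128;"))
```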
... control characters.
Another change was made from the HTML 2.0 SGML declaration, in the
belief that the latter did not express its authors' true intent. The
...
... identical with US-ASCII. In principle, this introduces an
incompatibility with HTML 2.0, but in practice it should increase
interoperability by i) having the SGML ...
... character set that could take its
place as the document character set for HTML. If nevertheless for a
specific application there is a need to use characters outside this
...
...
Since any text can logically be assigned a language, almost all HTML
elements admit the LANG attribute. The DTD ...
... elements in this version of HTML without the LANG attribute are BR,
HR, BASE, NEXTID, and META. It is also intended that any new element ...
... element
introduced in later versions of HTML will admit the LANG attribute,
unless there is a good reason not to do so.
...
...
The syntax and registry of HTML language tags is the same as that
defined by RFC 1766 ...
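RFC 1766 defines a language tag as a primary tag followed by optional hyphen-separated subtags, each consisting of one to eight ASCII letters. A minimal Python sketch of that syntax check follows. The name is_rfc1766_tag and the sample values are assumptions for this example, and the check covers syntax only, not whether a tag is actually registered.

```python
import re

# Primary tag and each subtag: 1 to 8 ASCII letters, separated by hyphens.
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z]{1,8})*")


def is_rfc1766_tag(value):
    """Return True if value matches the RFC 1766 language tag syntax."""
    return LANGUAGE_TAG.fullmatch(value) is not None


# Hypothetical sample values.
for candidate in ["en", "en-US", "en_US", ""]:
    print(repr(candidate), is_rfc1766_tag(candidate))
```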
... semantics, where applicable, are identical to [UNICODE],
and ii) where functionality is moved to HTML as a higher level
protocol, this is done in a way that allows straightforward
conversion to the lower-level mechanisms defined in [UNICODE ...
... this. It is also intended that any new element introduced in later
versions of HTML will admit the DIR attribute, unless there is a good
reason not to do so.
...
...
NOTE -- RFC 1866 section 4.2.2 specifies that an HTML user agent
should treat an end of line as a word space, except in
...
... BIDI markup in the form of special-purpose formatting characters.
This is also possible in HTML, which includes the five BIDI-related
formatting characters (202A - 202E) of ISO 10646. As an alternative,
...
... formatting characters (202A - 202E) of ISO 10646. As an alternative,
HTML provides equivalent SGML markup.
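For reference, the five formatting characters in the range 202A to 202E can be listed with Python's unicodedata module. This is only an illustrative sketch; the names BIDI_FORMATTING and describe are assumptions introduced for this example.

```python
import unicodedata

# The five BIDI-related formatting characters, U+202A through U+202E.
BIDI_FORMATTING = [chr(cp) for cp in range(0x202A, 0x202E + 1)]


def describe(chars):
    """Pair each character's code point (as U+XXXX) with its Unicode name."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]


for code, name in describe(BIDI_FORMATTING):
    print(code, name)
```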
...
... from the parent element. The default directionality of the overall
HTML document is left-to-right.
On inline elements ...
... integrity,
and alleviates some problems when editing bidirectional HTML text
with a simple text editor, but some software may be more apt at
using the 10646 characters. If both methods ...
... primarily a UI issue, there are some things that should be specified
at the HTML level to guide behavior and promote interoperability.
...
...
The HTML 2.0 form submission mechanism, based on the
"application/x-www-form-urlencoded" media type, is ill-equipped with regard to
...
...
In the case where a document is accessed from a hyperlink in an
origin HTML document, a CHARSET attribute is added to the attribute
list of elements ...
... HTML Public Text ...
... SGML Document Access (SDA) Parameter Entities =====-->
<!-- HTML contains SGML Document Access (SDA) fixed attributes
in support of easy transformation to the International Committee ...
... Flows ======================-->
<