RFC 3659: Extensions to FTP

2. Document Conventions


   This document makes use of the document conventions defined in BCP
   14, RFC 2119 [4].  That provides the interpretation of capitalized
   imperative words like MUST, SHOULD, etc.

   This document also uses notation defined in STD 9, RFC 959 [3].  In
   particular, the terms "reply", "user", "NVFS" (Network Virtual File
   System), "file", "pathname", "FTP commands", "DTP" (data transfer
   process), "user-FTP process", "user-PI" (user protocol interpreter),
   "user-DTP", "server-FTP process", "server-PI", "server-DTP", "mode",
   "type", "NVT" (Network Virtual Terminal), "control connection", "data
   connection", and "ASCII", are all used here as defined there.

   The required syntax is defined using the Augmented BNF (ABNF)
   defined in [5].  Some general ABNF definitions that are needed
   throughout the document are given later in this section.  On first
   reading, it may be wise to simply note that these definitions exist
   here, and skip to the next section.


2.1. Basic Tokens


   This document imports the core ABNF definitions given in Appendix A
   of [5].  Definitions for basic ABNF elements like ALPHA, DIGIT, SP,
   etc., will be found there.  The following terms are added for use in
   this document.

      TCHAR          = VCHAR / SP / HTAB    ; visible plus white space
      RCHAR          = ALPHA / DIGIT / "," / "." / ":" / "!" /
                       "@" / "#" / "$" / "%" / "^" /
                       "&" / "(" / ")" / "-" / "_" /
                       "+" / "?" / "/" / "\" / "'" /
                       DQUOTE   ; <"> -- double quote character (%x22)
      SCHAR          = RCHAR / "=" ;

   The VCHAR (from [5]), RCHAR, SCHAR, and TCHAR types give basic
   character types from varying sub-sets of the ASCII character set for
   use in various commands and responses.

      token          = 1*RCHAR

   A "token" is a string whose precise meaning depends upon the context
   in which it is used.  In some cases it will be a value from a set of
   possible values maintained elsewhere.  In others it might be a string
   invented by one party to an FTP conversation from whatever sources it
   finds relevant.

   Note that in ABNF, string literals are case insensitive.  That
   convention is preserved in this document, and implies that FTP
   commands added by this specification have names that can be
   represented in any case.  That is, "MDTM" is the same as "mdtm",
   "Mdtm", "MdTm", etc.  However, note that ALPHA, in particular, is
   case sensitive.  That implies that a "token" is a case sensitive
   value.  That implication is correct, except where explicitly stated
   to the contrary in this document, or in some other specification that
   defines the values this document specifies be used in a particular
   context.
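
   As an illustration only, the RCHAR and token rules above, together
   with the two comparison rules, might be sketched in Python as
   follows.  The function names (is_token, same_command, same_token)
   are hypothetical and are not defined by this specification.

```python
import string

# RCHAR per section 2.1: ALPHA / DIGIT plus the listed punctuation and DQUOTE.
# Note that "=" is deliberately absent; it belongs to SCHAR only.
RCHAR = frozenset(string.ascii_letters + string.digits + ",.:!@#$%^&()-_+?/\\'\"")
SCHAR = RCHAR | {"="}


def is_token(text: str) -> bool:
    """Return True if text matches token = 1*RCHAR (one or more RCHAR)."""
    return len(text) > 0 and all(ch in RCHAR for ch in text)


def same_command(a: str, b: str) -> bool:
    """Command names are ABNF string literals, so case is ignored."""
    return a.upper() == b.upper()


def same_token(a: str, b: str) -> bool:
    """Tokens are case sensitive unless a specification states otherwise."""
    return a == b
```

   Under these assumptions, same_command("MDTM", "mdtm") is true,
   while same_token("Type", "type") is false.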


2.2. Pathnames


   Various FTP commands take pathnames as arguments, or return pathnames
   in responses.  When the MLST command is supported, as indicated in
   the response to the FEAT command [6], pathnames are to be transferred
   in one of the following two formats.

      pathname       = utf-8-name / raw
      utf-8-name     = <a UTF-8 encoded Unicode string>
      raw            = <any string that is not a valid UTF-8 encoding>

   Which format is used is at the option of the user-PI or server-PI
   sending the pathname.  UTF-8 encodings [2] contain enough internal
   structure that it is always, in practice, possible to determine
   whether a UTF-8 or raw encoding has been used, in those cases where
   it matters.  While it is useful for the user-PI to be able to
   correctly display a pathname received from the server-PI to the user,
   it is far more important for the user-PI to be able to retain and
   retransmit the identical pathname when required.  Implementations are
   advised against converting a UTF-8 pathname to a local charset that
   isn't capable of representing the full Unicode character repertoire,
   and then attempting to invert the charset translation later.  Note
   that ASCII is a subset of UTF-8.  See also [1].
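
   A minimal sketch of the distinction between the two formats,
   assuming that any byte string which decodes strictly as UTF-8 was
   intended to be a utf-8-name.  The function names are hypothetical.
   The received bytes are kept unchanged for retransmission; decoding
   is done only to produce something to show the user.

```python
def classify_pathname(received: bytes) -> str:
    """Return "utf-8-name" if the bytes are valid UTF-8, otherwise "raw"."""
    try:
        received.decode("utf-8")  # strict decoding rejects invalid sequences
    except UnicodeDecodeError:
        return "raw"
    return "utf-8-name"


def display_name(received: bytes) -> str:
    """Build a string for display only; it must not be sent back to the server.

    Undecodable bytes are replaced with U+FFFD, so the result may not
    round-trip.  Retransmit the original bytes instead.
    """
    return received.decode("utf-8", errors="replace")
```

   A client following this approach would store the bytes exactly as
   received and pass those same bytes in later commands, whichever
   format was detected.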

   Unless otherwise specified, the pathname is terminated by the CRLF
   that terminates the FTP command, or by the CRLF that ends a reply.
   Any trailing spaces preceding that CRLF form part of the name.
   Exactly one space will precede the pathname and serve as a separator
   from the preceding syntax element.  Any additional spaces form part
   of the pathname.  See [7] for a fuller explanation of the character
   encoding issues.  All implementations supporting MLST MUST support
   [7].
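
   The separator and termination rules above can be sketched as
   follows, for the default case where nothing else is specified for
   the command.  The function name split_command is hypothetical.
   Only the first space is treated as a separator, and only the final
   CRLF is removed, so leading and trailing spaces stay in the name.

```python
from typing import Tuple


def split_command(line: bytes) -> Tuple[str, bytes]:
    """Split one control connection line into (command, pathname bytes)."""
    if not line.endswith(b"\r\n"):
        raise ValueError("command line must end with CRLF")
    body = line[:-2]  # drop only the terminating CRLF
    verb, separator, pathname = body.partition(b" ")  # first space only
    if not separator:
        raise ValueError("no pathname argument present")
    # Command names are case insensitive; the pathname is left untouched.
    return verb.decode("ascii").upper(), pathname
```

   For example, the line "MDTM", two spaces, "a b.txt", one space,
   CRLF yields the pathname " a b.txt " with one leading and one
   trailing space preserved.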

   Note: for pathnames transferred over a data connection, there is no
   way to represent a pathname containing the characters CR and LF in
   sequence, and distinguish that from the end of line indication.
   Hence, pathnames containing the CRLF pair of characters cannot be
   transmitted over a data connection.  Data connections only contain
   file names transmitted from server-FTP to user-FTP as the result of
   one of the directory listing commands.  Files with names containing
   the CRLF sequence must either have that sequence converted to some
   other form, such that the other form can be recognised and be
   correctly converted back to CRLF, or be omitted from the listing.
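
   Of the two options above, omission is the simpler one.  A server
   choosing it might filter names as in this sketch; the function name
   is hypothetical, and the reversible conversion alternative is not
   shown because its form is left to the implementation.

```python
from typing import Iterable, List


def names_safe_for_listing(names: Iterable[bytes]) -> List[bytes]:
    """Drop names that contain CR immediately followed by LF.

    A lone CR or a lone LF is kept by this sketch; only the CRLF pair
    is indistinguishable from the end of line indication.
    """
    return [name for name in names if b"\r\n" not in name]
```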

   Implementations should also beware that the FTP control connection
   uses Telnet NVT conventions [8], and that the Telnet IAC character,
   if part of a pathname sent over the control connection, MUST be
   correctly escaped as defined by the Telnet protocol.
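
   In the Telnet protocol the IAC character is the octet 255 (hex FF),
   and a literal IAC in the data is sent as two consecutive IAC
   octets.  Since hex FF never occurs in valid UTF-8, this affects
   only raw pathnames.  A minimal sketch, with hypothetical function
   names, and assuming the received data contains no other Telnet
   command sequences:

```python
IAC = b"\xff"  # Telnet "Interpret As Command" octet


def escape_iac(pathname: bytes) -> bytes:
    """Double every IAC octet before sending on the control connection."""
    return pathname.replace(IAC, IAC + IAC)


def unescape_iac(data: bytes) -> bytes:
    """Collapse doubled IAC octets in received data.

    Assumption: no other Telnet commands are present, so every IAC
    octet in the input is half of an escaped pair.
    """
    return data.replace(IAC + IAC, IAC)
```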

   NVT also distinguishes between CR, LF, and the end of line CRLF, and
   so would permit pathnames containing the pair of characters CR and LF
   to be correctly transmitted.  However, because such a sequence cannot
   be transmitted over a data connection (as part of the result of a
   LIST, NLST, or MLSD command), such pathnames are best avoided.

   Implementors should also be aware that, although Telnet NVT
   conventions are used over the control connections, Telnet option
   negotiation MUST NOT be attempted.  See section 4.1.2.12 of [9].


2.2.1. Pathname Syntax


   Except where TVFS is supported (see section 6), this specification
   imposes no syntax upon pathnames.  Nor does it restrict the character
   set from which pathnames are created.  This does not imply that the
   NVFS is required to make sense of all possible pathnames.  Server-PIs
   may restrict the syntax of valid pathnames in their NVFS in any
   manner appropriate to their implementation or underlying file system.
   Similarly, a server-PI may parse the pathname and assign meaning to
   the components detected.


2.2.2. Wildcarding


   For the commands defined in this specification, all pathnames are to
   be treated literally.  That is, for a pathname given as a parameter
   to a command, the file whose name is identical to the pathname given
   is implied.  No characters from the pathname may be treated as
   special or "magic", thus no pattern matching (other than for exact
   equality) between the pathname given and the files present in the
   NVFS of the server-FTP is permitted.

   Clients that desire some form of pattern matching functionality must
   obtain a listing of the relevant directory, or directories, and
   implement their own file name selection procedures.
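
   One way a client might implement its own selection is shown below,
   using shell-style patterns from the Python standard library.  The
   choice of pattern language is an assumption of this sketch; this
   specification does not define one.  The names are assumed to have
   been obtained already from a directory listing.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_names(listing: Iterable[str], pattern: str) -> List[str]:
    """Return the names from a listing that match a shell-style pattern.

    Matching happens entirely in the client and is case sensitive.
    The names selected are then sent to the server literally.
    """
    return [name for name in listing if fnmatchcase(name, pattern)]
```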


2.3. Times


   The syntax of a time value is:

      time-val       = 14DIGIT ["." 1*DIGIT]

   The leading, mandatory, fourteen digits are to be interpreted as, in
   order from the leftmost, four digits giving the year, with a range of
   1000--9999, two digits giving the month of the year, with a range of
   01--12, two digits giving the day of the month, with a range of
   01--31, two digits giving the hour of the day, with a range of
   00--23, two digits giving minutes past the hour, with a range of
   00--59, and finally, two digits giving seconds past the minute, with
   a range of 00--60 (with 60 being used only at a leap second).  Years
   in the tenth century, and earlier, cannot be expressed.  This is not
   considered a serious defect of the protocol.

   The optional digits, which are preceded by a period, give decimal
   fractions of a second.  These may be given to whatever precision is
   appropriate to the circumstance; however, implementations MUST NOT
   add precision to time-vals where that precision does not exist in
   the underlying value being transmitted.

   Symbolically, a time-val may be viewed as

      YYYYMMDDHHMMSS.sss

   The "." and subsequent digits ("sss") are optional.  However the "."
   MUST NOT appear unless at least one following digit also appears.

   Time values are always represented in UTC (GMT), and in the Gregorian
   calendar regardless of what calendar may have been in use at the date
   and time indicated at the location of the server-PI.

   The technical differences among GMT, TAI, UTC, UT1, UT2, etc., are
   not considered here.  A server-FTP process should always use the same
   time reference, so the times it returns will be consistent.  Clients
   are not expected to be time synchronized with the server, so the
   possible difference in times that might be reported by the different
   time standards is not considered important.
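
   The time-val syntax above might be parsed and generated as in the
   following sketch.  The function names are hypothetical.  The
   fraction is returned as the digit string received, so that no
   precision is invented, and the caller of format_time_val states how
   many fractional digits the underlying value really has.  Python's
   datetime type cannot hold a seconds value of 60, so this sketch is
   expected to reject a time-val that falls on a leap second.

```python
import re
from datetime import datetime, timezone
from typing import Optional, Tuple

_TIME_VAL_RE = re.compile(r"([0-9]{14})(?:\.([0-9]+))?\Z")


def parse_time_val(text: str) -> Tuple[datetime, Optional[str]]:
    """Return (UTC datetime to whole seconds, fraction digits or None)."""
    match = _TIME_VAL_RE.match(text)
    if match is None:
        raise ValueError("not a time-val")
    whole, fraction = match.groups()
    if int(whole[:4]) < 1000:
        raise ValueError("year must be in the range 1000 to 9999")
    # strptime raises ValueError for out-of-range months, days, and so on.
    moment = datetime.strptime(whole, "%Y%m%d%H%M%S")
    return moment.replace(tzinfo=timezone.utc), fraction


def format_time_val(moment: datetime, fraction_digits: int = 0) -> str:
    """Format a timezone-aware datetime as a time-val in UTC.

    fraction_digits (0 to 6) is the precision actually present in the
    underlying value; digits beyond it are truncated, never added.
    """
    utc = moment.astimezone(timezone.utc)
    text = utc.strftime("%Y%m%d%H%M%S")
    if fraction_digits > 0:
        text += "." + f"{utc.microsecond:06d}"[:fraction_digits]
    return text
```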


2.4. Server Replies


   Section 4.2 of [3] defines the format and meaning of replies by the
   server-PI to FTP commands from the user-PI.  Those reply conventions
   are used here without change.

      error-response = error-code SP *TCHAR CRLF
      error-code     = ("4" / "5") 2DIGIT

   Implementors should note that the ABNF syntax used in this document
   and in other FTP related documents (but not used in [3]), sometimes
   shows replies using the one-line format.  Unless otherwise explicitly
   stated, that is not intended to imply that multi-line responses are
   not permitted.  Implementors should assume that, unless stated to the
   contrary, any reply to any FTP command (including QUIT) may use the
   multi-line format described in [3].

   Throughout this document, replies will be identified by the three
   digit code that is their first element.  Thus the term "500 reply"
   means a reply from the server-PI using the three digit code "500".
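
   A client that accepts both reply formats might read replies as in
   this sketch.  It relies on the multi-line convention of [3]: the
   first line carries the code followed immediately by a hyphen, and
   the reply ends at the first later line that begins with the same
   code followed by a space.  The function names are hypothetical, and
   the input is assumed to be lines with the CRLF already removed.

```python
from typing import List, Tuple


def read_reply(lines: List[str]) -> Tuple[str, List[str]]:
    """Parse one complete reply into (three digit code, text lines)."""
    first = lines[0]
    code, marker = first[:3], first[3:4]
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("reply must start with a three digit code")
    if marker != "-":
        return code, [first[4:]]  # one-line reply
    text = [first[4:]]
    for line in lines[1:]:
        if line.startswith(code + " "):  # same code plus space ends the reply
            text.append(line[4:])
            return code, text
        text.append(line)  # intermediate line, kept as received
    raise ValueError("multi-line reply was not terminated")


def is_error_reply(code: str) -> bool:
    """error-code = ("4" / "5") 2DIGIT"""
    return code[0] in "45"
```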


2.5. Interpreting Examples


   In the examples of FTP dialogs presented in this document, lines that
   begin "C> " were sent over the control connection from the user-PI to
   the server-PI, lines that begin "S> " were sent over the control
   connection from the server-PI to the user-PI, and each sequence of
   lines that begin "D> " was sent from the server-PI to the user-PI
   over a data connection created just to send those lines and closed
   immediately after.  No examples here show data transferred over a
   data connection from the client to the server.  In all cases, the
   prefixes shown above, including the one space, have been added for
   the purposes of this document, and are not a part of the data
   exchanged between client and server.
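
   For readers who wish to process such examples mechanically, the
   prefixes can be removed as in this sketch.  The function name is
   hypothetical, and lines without one of the three prefixes are
   ignored.

```python
from typing import Dict, List


def split_example(transcript: str) -> Dict[str, List[str]]:
    """Group the lines of an example dialog by prefix, with prefix removed."""
    channels: Dict[str, List[str]] = {"C> ": [], "S> ": [], "D> ": []}
    for line in transcript.splitlines():
        prefix, payload = line[:3], line[3:]  # prefix includes the one space
        if prefix in channels:
            channels[prefix].append(payload)
    return channels
```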


