The Kermit Project |
612 West 115th Street, New York NY 10025 USA • firstname.lastname@example.org
F. da Cruz, Columbia University[ Kermit FTP Client ] [ FTP Scripting ] [ C-Kermit 8.0 ] [ Kermit Home ]
26 Sep 2002
Most Recent Update: Wed Oct 30 16:51:40 2002
The new releases of C-Kermit (8.0.206) and Kermit 95 (2.1) support new FTP protocol features from RFC 2389 as well as most of what's in the Elz and Hethmon Extensions to FTP Internet Draft (see References). Some of these features, such as SIZE (request a file's size), MDTM (request file's modification time), and REST (restart interrupted transfer) have been widely implemented in FTP clients and servers for years (as well as in the initial release of the Kermit FTP clients). Others such as FEAT and MLSD are rarely seen and are new to the upcoming Kermit releases. TVFS (Trivial Virtual File Store) is supported implicitly, and the UTF-8 character-set is already fully supported at the protocol and data-interchange level.
For Kermit users, the main benefit of the new FTP protocol extensions is the ability to do recursive downloads. But the extensions also introduce complications and tradeoffs that you should be aware of. Of course Kermit tries to "do the right thing" automatically in every case for backwards compatibility. But (as noted later) some cases are inherently ambiguous and/or can result in nasty surprises, and for those situations new commands and switches are available to give you precise control over Kermit's behavior, in case the defaults don't produce the desired results.
The reader is assumed to be familiar with FTP command-line clients and with the Kermit FTP client. If you are not aware of the Kermit FTP client, please CLICK HERE for a brief overview.
Guessing is nice when it works, but sometimes it doesn't, and some FTP servers become confused when you send them a directive they don't understand, or they do something you didn't want, sometimes to the point of closing the connection. For this reason, Kermit lets you override default or negotiated features with the following new commands:
500 'FEAT': command not understood
which is normally harmless (but you never know).
To enable or disable more than one feature, use multiple FTP ENABLE or FTP DISABLE commands. The SHOW FTP command shows which features are currently enabled and disabled.
Whenever a SIZE or MDTM directive is sent implicitly and rejected by the server because it is unknown, Kermit automatically disables it.
With the new FTP protocol extensions, now there are two ways to get the list of files: the NLST directive, which has been part of FTP protocol since the beginning, and the new MLSD directive, which is new and not yet widely implemented. When NLST is used and you give a command like "mget *.txt", the FTP client sends:
and the server sends back a list of the files whose names match, e.g.
foo.txt bar.txt baz.txt
Then when downloading each file, the client must send SIZE (so it can have a percent-done display) and MDTM (if it wants to set the downloaded file's timestamp to match that of the original), as well as RETR (to retrieve the file).
But when MLSD is used, the client is not supposed to send the filename or wildcard to the server; instead it sends an MLSD directive with no argument (or the name of a directory), and the server sends back a list of all the files in the current or given directory; then the client goes through the list and checks each file to see if it matches the given pattern, the rationale being that the user knows only the local conventions for wildcards and not necessarily the server's conventions. So with NLST the server interprets wildcards; with MLSD the client does.
The interpretation of NLST wildcards by the server is not necessarily required or even envisioned by the FTP protocol definition (RFC 959), but in practice all clients and servers work this way.
The principal advantage of MLSD is that instead of sending back a simple list of filenames, it sends back a kind of database in which each entry contains a filename together with information about the file: type, size, timestamp, and so on; for example:
size=0;type=dir;perm=el;modify=20020409191530; bin size=3919312;type=file;perm=r;modify=20000310140400; bar.txt size=6686176;type=file;perm=r;modify=20001215181000; baz.txt size=3820092;type=file;perm=r;modify=20000310140300; foo.txt size=27439;type=file;perm=r;modify=20020923151312; foo.zip (etc etc...)
(If the format of the file list were the only difference between NLST and MLSD, the discussion would be finished: it would always be better to use MLSD when available, and the MGET user interface would need no changes. But there's a lot more to MLSD than the file-list format; read on…)
The client learns whether the server supports MLSD in FEAT exchange. But the fact that the server supports MLSD doesn't mean the client should always use it. It is better to use MLSD:
But it is better to use NLST:
But when using MLSD there are complications:
To further complicate matters, NLST can (in theory) work just like MLSD: if sent with a blank argument or a directory name, it is supposed to return a complete list of files in the current or given directory, which the client can match locally against some pattern. It is not known if any FTP server or client does this but nevertheless, it should be possible since this behavior can be inferred from RFC 959.
In view of these considerations, and given the need to preserve the traditional FTP client command structure and behavior so the software will be usable by most people:
By default, Kermit's MGET command uses MLSD if MLST is reported by the server in its FEAT list. When MLSD is used, the filespec is sent to the server if it is not wild (according to Kermit's own definition of "wild" since it can't possibly know the server's definition). If the filespec is wild it is held for local use to select files from the list returned by the server. If MLST is not reported by the server or is disabled, Kermit sends the MGET filespec with the NLST directive.
The default behavior can be overridden globally with FTP DISABLE MLST, which forces Kermit to use NLST to get file lists. And then for situations in which MLSD is enabled, the following MGET switches can be used to override the defaults for a specific MGET operation:
mget /nlst foo.*
mget /mlsd foo.*
Command: With NLST: With MLSD: mget NLST MLSD mget *.txt NLST *.txt MLSD mget foo NLST foo MLSD foo mget /match:*.txt NLST MLSD mget /match:*.txt foo NLST foo MLSD foo
In other words, the pattern is always interpreted locally unless MGET uses NLST and no /MATCH switch was given.
mget /nlst t.h
In all cases in which the /RECURSIVE switch is included, the server's tree is duplicated locally.
Although MLSD allows recursion and NLST does not, the MLSD specification places a heavy burden on the client; the obvious, straightforward, and elegant implementation (depth-first, the one that Kermit currently uses) requires as many open temporary files as the server's directory tree is deep, and therefore client resource exhaustion -- e.g. exceeding the maximum number of open files -- is a danger. Unfortunately MLSD was not designed with recursion in mind. (Breadth-first traversal could be problematic due to lack of sufficient navigation information.)
Of course all of Kermit's other MGET switches can be used too, e.g. for finer-grained file selection (by date, size, etc), for moving or renaming files as they arrive, to override Kermit's automatic per-file text/binary mode switching, to pass the incoming files through a filter, to convert text-file character sets, and so on.
User's Command FTP Sends Remarks mget /nlst NLST Gets a list of all the files in the server's current and downloads each file. The list includes names only, so Kermit also must send SIZE and MDTM directives if size and timestamp information is required (this is always true of NLST). Sending NLST without an argument is allowed by the RFC959 NLST definition and by the Kermit FTP client, but might not work with other clients, and also might not work with every server. mget /nlst foo NLST foo If "foo" is a directory, this gets a list of all the files from the server's "foo" directory and downloads each file; otherwise this downloads the file named "foo" (if any) from the server's current directory. mget /nlst *.txt NLST *.txt Gets a list of the files in the server's current directory whose names match the pattern *.txt, and then downloads each file from the list. Because we are using NLST, we send the filespec (*.txt) to the server and the server interprets any wildcards. mget /nlst foo/*.txt NLST foo/*.txt Gets a list of the files in the server's "foo" directory whose names match the pattern *.txt, and then downloads each file from the list (server interprets wildcards). mget /nlst /match:*.txt NLST Gets a list of all the files in the server's current directory and then downloads each one whose name matches the pattern *.txt (client interprets wildcards). mget /nlst /match:*.txt foo NLST foo Gets a list of all the files in the server's "foo" directory and then downloads each one whose name matches the pattern *.txt (client interprets wildcards). mget /mlsd MLSD Gets a list of all the files from the server's current directory and then downloads each one. The list might include size and timestamp information, in which case Kermit does not need to send SIZE and MDTM directives for each file (this is always true of MLSD). mget /mlsd foo MLSD foo Gets a list of all the files from the server's "foo" directory (where the string "foo" does not contain wildcards) and then downloads each one. If "foo" is a regular file and not a directory, this command is supposed to fail, but some servers have been observed that send the file. mget /mlsd *.txt MLSD Gets a list of all the files from the server's current directory and then downloads only the ones whose names match the pattern "*.txt". Because we are using MLSD and the MGET filespec is wild, we do not send the filespec to the server, but treat it as though it had been given in a /MATCH: switch and use it locally to match the names in the list. mget /mlsd foo/*.txt MLSD This one won't work because MLSD requires that the notions of server directory and filename-matching pattern be separated. However, the client, which can't be expected to know the server's file-system syntax, winds up sending a request that the server will (or should) reject. mget /mlsd /match:*.txt MLSD Gets a list of all the files from the server's current directory and then downloads only the ones whose names match the pattern "*.txt" (client interprets wildcards). mget /mlsd /match:*.txt foo MLSD foo If "foo" is a directory on the server, this gets a list of all the files from the server's "foo" directory and then downloads only the ones whose names match the pattern "*.txt" (client interprets wildcards). This leaves the server CD'd to the "foo" directory; there's no way the client can restore the server's original directory because MLSD doesn't give that information, and since the client can not be expected to know the server's file-system syntax, it would not be safe to guess. If "foo" is a regular file, MLSD fails. mget /mlsd foo bar MLSD This one is problematic. You're supposed to be able to give MGET a list a filespecs; in this case we name two directories. The client must change the server's directory to "foo" to get the list of files, and then the files themselves. But then it has no way to return to the server's previous directory in order to do the same for "bar", as explained in the previous example. mget /mlsd /match:* [abc] MLSD [abc] Including a /MATCH: switch forces [abc] to be sent to the server even though the client would normally think it was a wildcard and hold it for local interpretation. In this example, [abc] might be a VMS directory name. mget /mlsd /match:* t*.h MLSD t*.h Contrary to the MLSD specification, all MLSD-capable FTP servers I've encountered so far do interpret wildcards. This form of the MGET command can be used to force a wildcard to be sent to the server for interpretation.
When MLSD is used implicitly (that is, without an /MLSD switch given to force the use of MLSD) and an MGET command such as "mget foo/*.txt" fails, Kermit automatically falls back to NLST and tries again.