Subject: Re: Content-Length (was Re: Banner)
From: Liviu Daia (Liviu.Daiaimar.ro)
Date: Wed Jun 21 2000 - 17:18:50 CDT


On 21 June 2000, Wietse Venema <wietseporcupine.org> wrote:
> Liviu Daia:
> > > > - Drop Content-Length:.
> > >
> > > That's good. But should it be configurable?
> >
> > It should be, who knows what devious scripts depend on it.
> >
> > On second thought, there's also another way to look at it:
> > in MIME-related legalese, the body of a message is an attachment,
> > and "Content-Length" is an attribute of that attachment. [*Sigh*]
> > Whoever invented 7-bit channels and their logical consequence
> > content encoding, deserves to die a slow, painful death.
>
> You lost me. How can a sender know the exact number of bytes of mail
> that is stored on a different machine, or does the count assume that
> everyone everywhere stores mail as CRLF delimited text? What about the
> Control-Z at the end? :-)

    Ok, let me try again. What I'm saying here is that "Content-Length"
has two incompatible meanings these days, each of them having its own
(additional) inconsistencies:

(1) The old meaning: "Content-Length" is the length of the body of the
message, according to the local system's quirks (CR-LF, Ctrl-Z, zero padding
on VMS etc.). This is the one that started this thread.

    The rationale for using it is that when the body of the message
is saved to a text file ("text" here being a format dependent on the
underlying system), the resulting file would have the length indicated
by "Content-Length". However, since "text" means different things on
different systems, and since no information about the originating OS is
included with mail messages, passing "Content-Length" to remote hosts
may (and does) wreak havoc on any filter program that pays attention to
it. Put this way, removing "Content-Length" would be a Good Thing [TM].
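
    To make (1) concrete, here is a minimal sketch that counts the
bytes of one and the same body under two line-ending conventions. The
helper name body_length and the sample text are invented for the
illustration; this is not how any particular mailer computes the header.

```python
def body_length(lines, newline):
    """Byte count of a body stored with the given line terminator."""
    return len("".join(line + newline for line in lines).encode("ascii"))

body = ["Hello,", "this is a test.", "Bye"]

unix_len = body_length(body, "\n")    # LF-terminated, as on Unix
dos_len = body_length(body, "\r\n")   # CR-LF-terminated, as on DOS/Windows

# The two counts differ by one byte per line, so a "Content-Length"
# computed on one system is wrong on the other.
print(unix_len, dos_len)
```

    A filter that trusts the sender's count would read too little or too
much of the body on the receiving side.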

(2) The MIME meaning. Consider an "image/jpeg" attachment. Normally,
it would be encoded as Base64, and it would have a "Content-Length"
indicating the length of the actual image file. Nothing wrong so far.
Now consider the same attachment as the body of a message. This is
legal, because RFC 2045 specifically allows _any_ kind of attachment as
the body (so that "old-style", non-MIME messages could be viewed as MIME
ones with a single attachment; in fact, according to RFC 2045, there's
no formal difference between attachments and message bodies --- and
indeed, some Microsoft mailers creatively take advantage of this brain
damage by allowing lusers to send messages with binary-only bodies,
without enclosing them in a "multipart/mixed" or whatever). In that
setting, the "Content-Length" of our "image/jpeg" would be promoted to a
_message_ header --- and removing it would mean removing an attribute of
the attachment.
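
    A rough sketch of case (2), using Python's standard email package
purely as a convenient way to build such a message. The 1024 bytes of
filler stand in for a real image file, and nothing here claims that any
given mailer emits "Content-Length" this way.

```python
import base64
from email.message import EmailMessage

raw = bytes(range(256)) * 4           # filler bytes standing in for a JPEG

# A message whose body *is* the image: no multipart wrapper at all.
msg = EmailMessage()
msg.set_content(raw, maintype="image", subtype="jpeg")

is_multipart = msg.is_multipart()
content_type = msg.get_content_type()
transfer_encoding = msg["Content-Transfer-Encoding"]

# Length of the file itself versus length of what travels on the wire.
raw_len = len(raw)
encoded_len = len(base64.encodebytes(raw))
decoded_len = len(msg.get_payload(decode=True))

print(content_type, transfer_encoding, raw_len, encoded_len)
```

    The headers describing the image sit among the message headers, and
the length of the file differs from the length of the encoded body, so
"the" length is ambiguous even before line endings come into play.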

    Things don't get any better if you restrict yourself to "text/*"
attachments / bodies, since none of the MIME-related RFCs specify which
"local" re-encodings are legal and which aren't. A Windows mailer, for
instance, is likely to translate LFs to CR-LFs if a "text/plain"
attachment is encoded as "us-ascii", but it might choose to leave LFs
alone if the same attachment was sent encoded as "Base64". Basically,
real, unadulterated crap.
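
    The difference can be sketched as follows. The "receiver" here is
hypothetical: it rewrites line endings of unencoded text and leaves
Base64 payloads alone, which is an assumption about mailer behaviour,
not something any RFC prescribes.

```python
import base64

original = b"line one\nline two\n"

# Hypothetical receiver that converts LF to CR-LF in unencoded text.
as_text = original.replace(b"\n", b"\r\n")

# The same bytes sent as Base64 survive the round trip untouched.
as_base64 = base64.decodebytes(base64.encodebytes(original))

print(len(original), len(as_text), len(as_base64))
```

    The same "text/plain" content thus ends up with two different
lengths, depending only on how it was encoded in transit.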

    I hope this somewhat clarifies my point.

    Regards,

    Liviu Daia

-- 
Dr. Liviu Daia               e-mail:   Liviu.Daiaimar.ro
Institute of Mathematics     web page: http://www.imar.ro/~daia
of the Romanian Academy      PGP key:  http://www.imar.ro/~daia/daia.asc