From: David Wong (david.wongfoundstone.com)
Date: Wed Dec 05 2001 - 09:36:43 CST
Yes, we definitely need to figure out what the "atoms" of this "class" of
problems are. I propose we put together a list of problems that we can fit
into "atoms". There are going to be some that don't fit at all, but my
guess is that 90% of the problems will fit into 5 - 10 categories. Here's
a first shot; I encourage others to add comments.
1. Encoding
- URL encoding
- double encoding
2. Extension handling
- trailing dot
3. Path problems
- Parent paths (../../)
- UNC Shares
- Relative/Absolute filenames
- FAT32 8.3 truncation
This is not a full list; for example, the ::$DATA Alternate Data Streams
vulnerability doesn't quite fit.
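The "Path problems" and "trailing dot" items above can be sketched in a few lines of Python. This is a minimal illustration, not a complete defense: WEB_ROOT and resolve_under_root are hypothetical names, and the trailing-dot handling assumes the Windows behavior of ignoring trailing dots and spaces on a file name.

```python
import posixpath

WEB_ROOT = "/var/www/site"  # hypothetical document root for this sketch


def resolve_under_root(requested: str, root: str = WEB_ROOT) -> str:
    """Return the canonical path for `requested`, or raise if it escapes `root`."""
    # Treat backslashes as path separators so "..\\.." is seen as "../..".
    unified = requested.replace("\\", "/")
    # Join relative to the root, then collapse "." and ".." segments.
    candidate = posixpath.normpath(posixpath.join(root, unified.lstrip("/")))
    # Assumption: mimic Windows, which ignores trailing dots and spaces
    # ("default.asp." opens "default.asp").
    candidate = candidate.rstrip(". ")
    # Make the security decision on the canonical form only.
    if candidate != root and not candidate.startswith(root + "/"):
        raise ValueError(f"path escapes web root: {requested!r}")
    return candidate
```

The point of the sketch is the ordering: canonicalize first, then decide. Checking the raw string for "../" before normalizing is the mistake this class of bug exploits.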
I know that some of these specific problems have been bugs in specific
web servers. I wanted to point out that I've seen web applications that are
susceptible to the same types of attack. A lot of people think that, oh,
since these vulnerabilities are now fixed and my box is patched, they
aren't a problem anymore. Wrong! We've seen many applications susceptible
to basic URL hex encoding attacks because they assume the web server
will decode the string for them. This is not necessarily true, especially
for code that runs in an ISAPI or Apache filter. Depending on the priority
level that the filter is set to, you may receive raw data or decoded
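A rough Python sketch of the point about raw versus decoded input: code that may receive the raw request string can decode it itself before making a decision. The names canonical_url_path, is_protected and MAX_ROUNDS are made up for this example, and the "/secure" prefix is borrowed from the contrived example elsewhere in this thread.

```python
from urllib.parse import unquote

MAX_ROUNDS = 5  # arbitrary cap chosen for this sketch


def canonical_url_path(raw: str) -> str:
    """Percent-decode `raw` repeatedly until it stops changing."""
    current = raw
    for _ in range(MAX_ROUNDS):
        decoded = unquote(current)
        if decoded == current:
            return current  # stable: no encoding layers left
        current = decoded
    raise ValueError("too many layers of encoding")


def is_protected(raw_path: str) -> bool:
    """Hypothetical access check: anything under /secure needs authorization."""
    return canonical_url_path(raw_path).lower().startswith("/secure")
```

A check on the raw string misses the encoded forms: "/%73ecure".startswith("/secure") is False, while is_protected("/%73ecure") and is_protected("/%2573ecure") are both True. Decoding until stable is only one possible policy; rejecting any input that still contains an escape after one decode is a stricter alternative.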
From: Dennis Groves [mailto:dwgmac.com]
Sent: Tuesday, December 04, 2001 5:30 PM
To: Mark Curphey; David Wong
Subject: Re: (OWASP)Re: Canonicalization representation issues
I would like to say that I look forward to receiving emails like this; what
an excellent idea. However, it seems to me that we need to do some thinking
about where this goes. For example, parameter manipulation is on its own;
as a black box tester, it seemed to me to be yet another client-side
trust issue that therefore belonged in input validation. However, when
thinking about this, I quickly came to the conclusion that parameter
manipulations have a flavor all their own. Now, that being said, and given your
obvious understanding of our mission, canonicalization clearly belongs on
its own. Since we do not want to be in a position of documenting
exploits but rather the "atoms" of security problems, is there a way to cover
this class of exploits without documenting every instance of it?
You list 12 items off the top of your head. Yikes! If this is true, it is a
bigger issue than even input validation! However, I am guessing that we have
to do some thinking about what the members of the "class" have in common,
and see if we cannot reduce that to a set of "atoms" that belong within
> This really makes a lot of sense. Like me, when you break down the attack
> to its raw components, you are able to model in your head all the
> That's what we will be attempting to do with the testing framework and
> XML modeling that this stuff was set up to support.
> If the tech committee thinks it's OK, I say we should def change it as well.
> ---- David Wong <david.wongfoundstone.com> wrote:
>> I was looking through the Attack Components list for Input Validation
>> and it appears to me that there is a class of attacks not fully addressed.
>> Unicode strings and URL encoded strings belong in a class of bug that
>> Michael Howard terms "Canonicalization" bugs in his book "Writing Secure
>> Code". There is an entire chapter about canonical representation issues
>> in the book, but I'll try to briefly describe it here for the list.
>> The web application makes security decisions based on a string (a URL,
>> an HTTP header, a cookie value), and if the string can be represented
>> in another, non-canonical form, the application can make an incorrect
>> decision. Let's use the two examples above.
>> - Unicode representation, this is classic.
>> canonical form of this URL and the one that is executed by IIS is
>> http://www.victim.com/winnt/system32/cmd.exe?c+dir. So, an incorrect
>> decision was made that cmd.exe was in the /scripts directory and hence
>> executable and it also bypassed the parent paths check.
>> - URL encoded input: a contrived example here is that you could access
>> a secure directory by encoding part of the string, such as
>> http://www.victim.com/%73ecure which would have the canonical form
>> My point is, there are many MORE examples of similar types of problems
>> that are not part of the Attack Components. Does it make sense to individually
>> list these in the Input Validation section, or to create another subsection,
>> possibly "canonicalization attacks"? The way it is grouped isn't that
>> important, but if we look at the problem as canonical representation bugs,
>> we can work to identify all the problems that fall under this category.
>> Here's a list of some similar bugs off the top of my head.
>> - ::$DATA
>> - +.htr
>> - Trailing-dot
>> - UCS-2 Unicode encoding
>> - UTF-8 encoding
>> - Double encoding
>> - ANY type of encoding the app/OS understands. For example, foreign
>> - Dotless IP: http://3232286052/ is really http://192.168.197.100
>> - FAT32 filesystem names: SECRET~1.TXT can be SECRETFILE.TXT
>> - Relative file names vs. Absolute filenames
>> - UNC file names
>> - \\?\ format in Windows
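The dotless IP item in the list above is easy to verify. The sketch below uses Python's standard ipaddress module; canonical_ipv4 is a hypothetical helper name, and it only handles the plain decimal-integer and dotted-quad forms.

```python
import ipaddress


def canonical_ipv4(host: str) -> str:
    """Return the dotted-quad form of an IPv4 host given as an integer or dotted quad."""
    if host.isascii() and host.isdigit():
        # A "dotless" host is just the 32-bit address written as one integer.
        return str(ipaddress.IPv4Address(int(host)))
    # Otherwise expect an ordinary dotted quad; invalid input raises ValueError.
    return str(ipaddress.IPv4Address(host))


print(canonical_ipv4("3232286052"))  # prints 192.168.197.100
```

An allow list or deny list keyed on the dotted form should be compared against this canonical value, not against the host string as the client sent it.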
>> BTW, all credit goes to Michael Howard for the canonicalization
>> classification, and his new book is excellent. Most of the bugs
>> above are described in the book.
>> Dave Wong
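As an illustration of the "UTF-8 encoding" item in the quoted list, here is a small Python sketch. It assumes the item refers to overlong UTF-8 sequences, which the thread does not spell out. naive_decode_2byte is a deliberately lenient decoder written for this example; it is not how any particular server decodes input.

```python
def naive_decode_2byte(b1: int, b2: int) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting overlong forms."""
    # Take 5 payload bits from the lead byte and 6 from the continuation byte.
    code_point = ((b1 & 0x1F) << 6) | (b2 & 0x3F)
    return chr(code_point)


# 0xC0 0xAF is an overlong (non-canonical) encoding of U+002F, the "/" character.
overlong_slash = bytes([0xC0, 0xAF])
print(naive_decode_2byte(*overlong_slash))  # the lenient decoder yields "/"

try:
    overlong_slash.decode("utf-8")  # Python's strict decoder refuses it
except UnicodeDecodeError:
    print("strict decoder rejects the overlong form")
```

A filter that looks for "/" or "../" in the byte stream before a lenient decoder runs never sees the slash, which is the same decide-before-canonicalize problem as the other items in the list.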