From: auto125268@hushmail.com
Date: Mon Jul 23 2001 - 21:04:35 CDT
Sure, but URL parsing is just part of the issue, isn't it? When you parse a URL, you first have to convert it back from URL encoding and then convert it to Unicode before you can perform any operations on it. For URL encoding, the issues are reserved characters, ASCII control characters, non-ASCII characters, and unsafe characters.
The specification for URLs (RFC 1738) is a problem, because it restricts the characters allowed in URLs to a limited subset of the US-ASCII character set:
"...Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not including the quotes - ed], and reserved characters used for their reserved purposes may be used unencoded within a URL."
URLs use some characters for special purposes in defining their syntax. When these characters are not used in their special role inside a URL, they need to be encoded: ";", "/", "?", ":", "@", "=" and "&".
ASCII control characters are non-printable. These are the ISO-8859-1 (ISO-Latin) character ranges 00-1F hex (0-31 decimal) and 7F hex (127 decimal).
Non-ASCII characters are by definition not legal in URLs, since they are not in the ASCII set. This includes the entire "top half" of the ISO-Latin set, 80-FF hex (128-255 decimal).
The unsafe characters are:
Space - Significant sequences of spaces may be lost in some uses (especially multiple spaces)
Quotation marks
'Less Than' symbol ("<")
'Greater Than' symbol (">") - These characters are often used to delimit URLs in plain text.
'Pound' character ("#") - This is used in URLs to indicate where a fragment identifier (bookmarks/anchors in HTML) begins.
Percent character ("%") - This is used to URL encode/escape other characters, so it should itself also be encoded.
Left Curly Brace ("{")
Right Curly Brace ("}")
Vertical Bar/Pipe ("|")
Backslash ("\")
Caret ("^")
Tilde ("~")
Left Square Bracket ("[")
Right Square Bracket ("]")
Grave Accent ("`") - Some systems can possibly modify these characters.
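
To make the categories above concrete, here is a minimal Python sketch that sorts a character into one of them and percent-encodes everything that is not safe. The names classify and encode_component are my own and do not come from any library. Encoding non-ASCII characters as UTF-8 bytes is also an assumption, since RFC 1738 does not say which character encoding to use. Reserved characters are encoded too, on the assumption that the text is data and not URL syntax.

```python
import string

# Character classes as listed in RFC 1738, section 2.2.
ALPHANUMERIC = set(string.ascii_letters + string.digits)
SPECIAL = set("$-_.+!*'(),")
RESERVED = set(";/?:@=&")


def classify(ch):
    """Return the category of a single character (hypothetical helper)."""
    code = ord(ch)
    if code > 0x7F:
        return "non-ascii"
    if code <= 0x1F or code == 0x7F:
        return "control"
    if ch in ALPHANUMERIC or ch in SPECIAL:
        return "safe"
    if ch in RESERVED:
        return "reserved"
    # Everything left in printable ASCII is on the unsafe list:
    # space " < > # % { } | \ ^ ~ [ ] `
    return "unsafe"


def encode_component(text, encoding="utf-8"):
    """Percent-encode every character that is not in the safe class.

    Assumption: reserved characters are being used as data, not in
    their reserved role, so they are encoded as well.
    """
    out = []
    for ch in text:
        if classify(ch) == "safe":
            out.append(ch)
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode(encoding))
    return "".join(out)
```

For example, encode_component("a b/c") gives "a%20b%2Fc", because the space is unsafe and the slash is reserved.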
Given all that, I am not clear how that makes the Unicode problem any easier, as you have to do all of that before the Unicode conversion. What am I missing?
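
Here is a rough Python sketch of the ordering I mean: undo the URL encoding, convert the resulting bytes to Unicode, and only then look for ".." segments. The function name is_safe_path is hypothetical. Treating the bytes as strict UTF-8 and treating backslash as a path separator are both assumptions, not something any particular server does.

```python
from urllib.parse import unquote


def is_safe_path(raw_path):
    """Return False if the decoded path contains a '..' segment.

    Hypothetical helper: decode first, check afterwards.
    """
    try:
        # Steps 1 and 2: undo percent-encoding, then decode the bytes
        # as strict UTF-8. Invalid sequences (including overlong
        # forms such as C0 AF) raise instead of being passed through.
        decoded = unquote(raw_path, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return False

    # Step 3: only now inspect the path. Backslash is treated as a
    # separator too (assumption: some platforms accept it).
    segments = decoded.replace("\\", "/").split("/")
    return ".." not in segments
```

With this ordering, "/docs/%2e%2e/%2e%2e/etc/passwd" is rejected because the check runs on the decoded text, and "/scripts/..%c0%af../winnt" is rejected because the byte sequence is not valid UTF-8. Checking for "/../" on the raw string would miss both.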
> --- auto125268@hushmail.com wrote:
> > Unicode - Given all Java internals are dealt with in Unicode
> > is there an exposure here ?
>
> Yes there is. The URL parsing is easier but if you forget to check for
> /../ sequences, you'll be vulnerable.
>
> > I would have thought that a Java web server would
> > be immune to the Unicode bugs that have affected IIS ?
>
> The only way to be "100% immune" to web directory traversal bugs is to
> "chroot" your server.