From: Sverre H. Huseby (shh_at_thathost.com)
Date: Wed Oct 16 2002 - 15:48:52 CDT
[b0iler]
| Also, you are just sending the inputed values of parameters. What
| about the names of the parameter (the $key variables)? They could
| contain potentially dangerous XSS which is often printed to the
| client. Also, user input (GPC) is not the only tainted data in a
| script. Any data that comes from an outside source is potientally
| dangerous. Files, databases, ENV variables, etc.. need to be
| treated as if it contains the most clever tricks to evade your
| filtering and protection schemes.
Correct. And I've tried to say the same quite a few times on several
securityfocus lists the last two years.
We need to shift the focus away from _input_. Input is never
troublesome in itself. It only becomes troublesome when put in a
context in which it is interpreted in some way. And then again only
when parts of it will not be interpreted as plain data, but as
something else.
As b0iler (whoever that is :) ) correctly states above, data from the
inside may cause just as much trouble as data from the outside. And
it may do so deep inside a multi-tier system, far from the web layer.
It's when data is passed somewhere for interpretation that it gets
troublesome. We should thus pay attention to the format of the data
whenever we _pass_it_along_, rather than when we receive it from the
outside. Web applications tend to pass data along all the time:
* to database servers, often by concatenating the data with strings
containing SQL constructs, or by using some kind of prepared
statement mechanism (much better).
* to shell command interpreters (yikes!).
* to the OS by sending file names to file handling functions, host
names to name resolution libraries and so on. (a large amount of
"so on" for the OS.)
* to legacy systems written in some obscure language using some
equally obscure protocol.
* to other web servers (B2B) using XML, URL parameters or whatever.
* to other processes running on the same server, using some
internally made protocol.
* many, many, many more...
* and last, but not least, to the web browser of the user. Which
luckily is just another sub-system, covered by the same rule as
the rest.
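
To make the database case above concrete, here is a minimal sketch
in Python using the standard sqlite3 module. The table, the column
names and the find_user() helper are made up for illustration. The
point is only that the data travels as a bound parameter, so the
quote never gets a chance to end the SQL string constant:

```python
import sqlite3


def find_user(conn, name):
    # Hypothetical helper. The "?" placeholder makes the driver pass
    # `name` as data, separately from the SQL text, so a quote inside
    # it is never interpreted as the end of a string constant.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# In-memory database with an assumed, illustrative schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("Sinead O'Connor",))

# A name containing a quote is found as-is.
rows = find_user(conn, "Sinead O'Connor")

# A classic injection attempt is just an unusual name that matches nothing.
injected = find_user(conn, "x' OR '1'='1")
```

Other database APIs have their own placeholder syntax, so take the
"?" as specific to this sketch.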
And to repeat: "Data" is not only user input. It is anything, no
matter the source. Every system we pass data to has its own way of
interpreting it, and the interpretation depends on context. Some
examples:
* when building strings containing SQL queries, the quote character
may cause trouble if it appears prematurely in an SQL string
constant. _Any_ data passed as part of an SQL statement _that_is
_to_be_interpreted_as_a_string_constant_ will need to have quotes
escaped in some way. (No, we can't generally forbid quotes. How
would I be able to write "can't" a few words back if you forbid
the quote?) And no, we can't generally escape quotes at input
time either, because then they will look rather funny for the
_other_ sub-systems, in which quotes have no special meaning
(eg. a text file or the user's browser).
For more on this, see another vuln-dev-mail of mine available
here:
http://shh.thathost.com/text/passing-data-03.txt
* when talking to the OS, null-bytes may create confusion when
passing strings, as the OS (written in C, normally) treats the '\0'
as a string terminator. Most "modern" languages do not. We'll
generally need to pay attention to null-bytes when talking to
sub-systems written in C. The reason is generally that our view of
the string will differ from the view taken by the OS.
But there are other things as well. If we pass a _file_name_ to
the OS, we may need to pay attention to slashes (and for some
obscure OSes, backslashes) and double-dots as well, as they will
switch context from _file_ to _directory_.
And hundreds of other examples on how talking to one particular
sub-function (eg. open()) of a sub-system (eg. the OS) will need
careful handling of a selected set of characters.
* and then comes the browser again. The HTML parser in the browser
gives special meaning to < (tag start), > (tag end) and &
(character entity). And if inside those < and >, suddenly " and '
(both attribute value encapsulators) may have a special meaning
too. We'll need to escape them somehow, so that they are not
treated as special characters, but rather as plain characters.
The correct way is to use HTML encoding (as most of you know).
The wrong way (generally) is to replace the special characters
with nothing. Imagine all the complaints you will get if you make
a discussion forum for mathematicians, and disallow < and > ...
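
The browser and file name cases above could be sketched like this in
Python. Both function names are invented for the example, and the
file name check is deliberately strict (an assumption on my part, a
real application may need to allow more). html.escape() is the
standard library's HTML encoder:

```python
import html


def to_html_text(data):
    # Encode at the moment the data is handed to the browser.
    # quote=True also encodes " and ', which matter inside attributes.
    return html.escape(data, quote=True)


def is_plain_file_name(name):
    # Hypothetical check made just before calling a file function.
    # A null byte would make the C-level view of the string differ
    # from ours, so reject it outright.
    if "\0" in name:
        return False
    # Slashes and backslashes switch context from file to directory.
    if "/" in name or "\\" in name:
        return False
    # Empty, current-directory and parent-directory names are not files.
    if name in ("", ".", ".."):
        return False
    return True


# The mathematicians keep their < and >; the browser sees plain text.
forum_post = to_html_text("if a < b & b > c then ...")
```

Note that nothing is removed from the forum post. The characters
are still there, they are just no longer special to the HTML parser.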
It is generally _not_possible_ to fetch data from the request and
start by doing something to it that will match all the possible
sub-systems in one go. Not without giving severe restrictions as to
what the data may contain. ("Sorry, Sinead, but your name will have
to be OConnor for now"). And not without introducing strange
appearances for some of the sub-systems. ("Welcome, Sinead O\'Connor").
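
A small Python sketch of the two failure modes just mentioned, next
to the alternative of keeping the data raw and encoding it for the
sub-system it is actually sent to. The helper names are made up, and
escape_at_input() stands in for any "fix it once at the door" scheme:

```python
import html

raw = "Sinead O'Connor"


def strip_at_input(value):
    # Hypothetical input filter that simply forbids the quote.
    return value.replace("'", "")


def escape_at_input(value):
    # Hypothetical input filter that backslash-escapes quotes up front,
    # as if every later sub-system were an SQL string constant.
    return value.replace("'", "\\'")


def greet_html(name):
    # Encode for HTML only when the data is passed to the browser.
    return "Welcome, " + html.escape(name, quote=True)


restricted = strip_at_input(raw)                 # the name is damaged
premature = "Welcome, " + escape_at_input(raw)   # backslash leaks to the page
at_output = greet_html(raw)                      # renders as the real name
```

The last variant looks odd in the page source (&#x27;), but the
browser displays it as the plain quote, which is the whole idea.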
Input validation has been given _far_ too much focus. It may be good
as a first measure, to be able to give users nice feedback when data
don't match the business rules and other high level rules ("the file
name is not supposed to contain directory elements"), but it generally
won't solve the low level problems. In systems over toy size, data is
passed between many different sub-systems, which often have different
meta-characters that may be abused. People who believe that input
validation at the web layer will avoid security problems several
layers down below (or when data come back to the first layer again)
have given the issue too little thought, IMNSHO.
Focus on input validation, but focus even more on handling every
possible meta-character, meta-byte, meta-word or whatever before passing
the data along to the next sub-system, whatever that is. And that
rule goes for every layer of the application, not just the web layer.
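
As one more illustration of handling meta-characters right at the
hand-over, here is a sketch for the shell/process case in Python.
The helper names are mine, and the child process is just the Python
interpreter echoing its argument. The first helper avoids the shell
entirely by passing an argument list; the second uses shlex.quote()
for the cases where a POSIX shell command string can't be avoided:

```python
import shlex
import subprocess
import sys


def echo_via_argv(data):
    # Hypothetical helper: `data` is one element of the argument list,
    # no shell parses it, so ";" and friends have no special meaning.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", data],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def for_posix_shell(data):
    # Quote the data for a POSIX shell just before building the command.
    return shlex.quote(data)


hostile = "report.txt; echo pwned"
echoed = echo_via_argv(hostile)     # comes back unchanged, nothing executed
quoted = for_posix_shell(hostile)   # one single-quoted shell word
```

shlex.quote() only knows POSIX shell rules, so it is no help for
other command interpreters.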
Sverre - who feels this discussion would fit better at webappsec than
at vuln-dev.
--
shh_at_thathost.com          Computer Geek? Try my Nerd Quiz
http://shh.thathost.com/     http://nerdquiz.thathost.com/