Re: Input validation

From: Jeremiah Grossman (jeremiahwhitehatsec.com)
Date: Thu Jun 19 2003 - 21:37:48 CDT


On Thu, 2003-06-19 at 10:38, Kooper, Larry wrote:
> I am a newbie to this list - apologies if this question is often asked. (I
> don't know if the list has a FAQ).
>
> When securing a web site against attacks such as SQL injection and XSS, what
> approach do you recommend following to validate user input?
>
> 1) Attempt to massage data so that it becomes valid
> 2) Reject input that is known to be bad
> 3) Accept only input that is known to be good
>
> (The three categories are taken from a paper here-
> http://www.nextgenss.com/papers/advanced_sql_injection.pdf ,p22)
>
> The problem with solutions 1 and 2 is that you may miss some forms of bad
> input. Another subtle problem with solution 1 and 2 is that sometimes bad
> input can be embedded in good input. For example, if someone searches for
> "director's selections" the string "select" would be rejected (as a SQL
> command), resulting in "director's ions."
>
> Solution 3 seems like the most secure but also the most expensive to
> implement. And the problem seems more difficult when validating free-format
> fields such as a name or an address. One could reject non-alphanumeric
> characters, but then things like # (for apartment number) or - (hyphen)
> would be kicked out. Any thoughts?

I personally like #3. Proper sanity checking can be difficult to
implement in some cases, but it is probably less difficult than
massaging data back into conformity as suggested by #1. I find the
hardest part is not the code itself, but remembering to do the sanity
checking on all input and not becoming lazy in the process.

The 3 main things I do when sanity checking input to keep things safe
are:

a character-set check, a length check, and escaping all input. That
means making sure I only get the characters I expect, within the
min/max length I expect, and always escaping all data. For anything
else, I kick back an error and never echo user-supplied input.
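
The three checks above could be sketched roughly as follows. This is
only an illustration in Python, not code from any real application:
the allowed character set, the length limits, and the names
check_field, escape_for_html and InvalidInput are all assumptions made
up for the example. The escaping shown is for HTML output only; other
destinations (a SQL query, a shell command) need their own escaping or
parameter binding.

```python
import html
import re

# Hypothetical whitelist for a free-format name/address field:
# ASCII letters, digits, space, and a few punctuation marks.
ALLOWED = re.compile(r"[A-Za-z0-9 .,'#-]+")
MIN_LEN, MAX_LEN = 1, 40  # assumed limits, tune per field


class InvalidInput(ValueError):
    """Raised for any input that fails a check."""


def check_field(value):
    # Length check: reject anything outside the expected bounds.
    if not (MIN_LEN <= len(value) <= MAX_LEN):
        raise InvalidInput("invalid input")  # generic, never echoes value
    # Character-set check: the whole string must match the whitelist.
    if ALLOWED.fullmatch(value) is None:
        raise InvalidInput("invalid input")
    return value


def escape_for_html(value):
    # Escape on output, even for data that already passed the checks.
    return html.escape(value, quote=True)
```

With this sketch, "director's selections" passes the checks untouched
and the apostrophe is only escaped when the value is written out,
while something like "<script>" is rejected with a generic error.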

About the # and - characters: escaping should solve the problem, or you
can simply allow a few more characters in your set. Just watch those
metacharacters.

For your apt. question, my regex skills on display:
/^([\d#\-]{1,5})$/
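
As a rough illustration, the same pattern might be applied in Python
like this. The function name is_apt_number is made up for the example,
and two small liberties are taken: [0-9] is spelled out instead of \d,
because \d in Python 3 also matches non-ASCII digits, and fullmatch is
used instead of ^...$, because $ also matches just before a trailing
newline.

```python
import re

# Same idea as /^([\d#\-]{1,5})$/: 1 to 5 characters, each one a digit,
# '#' or '-'. The trailing '-' inside the class is a literal hyphen.
APT_RE = re.compile(r"[0-9#-]{1,5}")


def is_apt_number(value):
    # fullmatch requires the entire string to match, so a trailing
    # newline is rejected rather than silently accepted.
    return APT_RE.fullmatch(value) is not None


print(is_apt_number("#12"))     # True
print(is_apt_number("12-B"))    # False: letters are not in the set
print(is_apt_number("123456"))  # False: longer than 5 characters
print(is_apt_number("12\n"))    # False: trailing newline rejected
```

Note that the set as written has no letters, so an apartment such as
"12-B" would need the character class widened.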

regards,

Jer-