Subject: Re: idea: content filtering
From: Russ Allbery (rra@stanford.edu)
Date: Thu May 11 2000 - 20:29:51 CDT


Bennett Todd <bet@rahul.net> writes:

> I'd sincerely hope that this wouldn't be necessary; a Perl content
> filter would be one heck of a slow-moving porker to put on the critical
> path for mail delivery.

I think you might be extremely surprised. We use an embedded Perl filter
in INN to do the same thing for news, and I can handle upwards of 20
articles a second on a Sun Ultra 2, doing complicated regex checks and MD5
checksums on around 15-20% of all incoming articles as well as doing
header parsing and checking of every incoming article.

Perl is going to be faster than almost anything except for hand-written C,
and in some benchmarks is faster than C++, provided that you're doing
things that it's quick at (like regular expression matches) and provided
that you keep it resident and don't reload it with every incoming message.
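
As a rough sketch of what "keep it resident" can look like in practice
(this is not the actual INN filter; the pattern list, the parse_message
and filter_message names, and the accept/reject strings are all made up
for illustration): patterns are compiled once with qr// when the code is
loaded, every message gets a cheap header check, and only messages with a
suspicious subject pay for the MD5 checksum. It assumes the host program
hands each message to filter_message as a single string of bytes.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical patterns, compiled once at load time with qr// so the
# per-message cost is only the match itself.
my @SUSPECT_SUBJECT = (
    qr/\bmake money fast\b/i,
    qr/\bfree\b.*\boffer\b/i,
);

# Checksums of bodies already seen. This hash survives between messages
# only because the interpreter stays resident.
my %seen_body;

# Split a raw message into a header hash (lower-cased names) and a body.
sub parse_message {
    my ($raw) = @_;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;
    $head =~ s/\r?\n[ \t]+/ /g;    # unfold continuation lines
    my %header;
    for my $line (split /\r?\n/, $head) {
        my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        $header{lc $name} = $value;
    }
    return (\%header, $body);
}

# Called once per incoming message; returns 'accept' or a reject reason.
sub filter_message {
    my ($raw) = @_;
    my ($header, $body) = parse_message($raw);

    # Cheap header check applied to every message.
    return 'reject: no message-id' unless defined $header->{'message-id'};

    # Costlier checksum only when the subject looks suspicious.
    my $subject = defined $header->{subject} ? $header->{subject} : '';
    if (grep { $subject =~ $_ } @SUSPECT_SUBJECT) {
        my $sum = md5_hex($body);
        return 'reject: duplicate body' if $seen_body{$sum}++;
    }
    return 'accept';
}

# Two copies of the same body under different Message-IDs.
for my $id (1, 2) {
    my $msg = join "\n",
        "Message-ID: <$id\@example.invalid>",
        'Subject: Make money fast',
        '',
        'Buy now',
        '';
    print filter_message($msg), "\n";
}
# prints "accept", then "reject: duplicate body"
```

The part that matters for speed is that the qr// patterns and %seen_body
are set up once and then reused; a filter that is started fresh for each
message pays the interpreter startup and compile cost every time and
cannot remember earlier checksums at all.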

It required some work to get it that fast, and I'd be happy to provide
some tips on how to write that sort of embedded Perl filter and what sort
of glue code you need. But it's *way* easier, and generally faster as
well, to write your filters in Perl than in C.

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>