Re: [Dailydave] approximate string matching

From: Mateusz Berezecki (mateuszbgmail.com)
Date: Fri Sep 01 2006 - 06:48:50 CDT


Hello Arun,

On 9/1/06, Arun Koshy <arunkoshygmail.com> wrote:
> On 9/1/06, Mateusz Berezecki <mateuszbgmail.com> wrote:
> > Is anyone aware of a good implementation of any of these algorithms
> > in C or perhaps some opensource C library for that purpose?
> > Do you have any recommendations?
>
> Check :
>
> http://www.dcs.shef.ac.uk/~sam/stringmetrics.html#jaccard
>
> The above links into a sourceforge project that has an implementation
>
> http://sourceforge.net/projects/simmetrics/
>
> Hope that helps
>

Well, sort of :-) I did check the simmetrics project, but it's in C#,
and reimplementing the interfaces and all the required tokenizer
libraries is too much effort for now.

I want something fast yet simple, like
http://en.wikipedia.org/wiki/Bitap_algorithm - its fuzzy variant matches
within a given Levenshtein distance.
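
To make the idea concrete, here is a minimal sketch of fuzzy Bitap in C
(the Wu-Manber shift-and formulation), assuming the pattern fits in one
64-bit word. The name bitap_fuzzy_search and the BITAP_MAX_K limit are
made up for illustration and do not come from any existing library. It
returns the index in the text where the first match within k edits
(insertions, deletions, substitutions) ends, or -1 if there is none.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define BITAP_MAX_K 16  /* hypothetical cap on the allowed edit distance */

/* Returns the index of the last text character of the first substring
 * that is within Levenshtein distance k of pattern, or -1 if no such
 * substring exists or the arguments are out of range. */
long bitap_fuzzy_search(const char *text, const char *pattern, int k)
{
    assert(text != NULL && pattern != NULL);

    size_t m = strlen(pattern);
    if (m == 0 || m > 63 || k < 0 || k > BITAP_MAX_K)
        return -1;

    /* mask[c] has bit i set when pattern[i] == c */
    uint64_t mask[UCHAR_MAX + 1] = {0};
    for (size_t i = 0; i < m; i++)
        mask[(unsigned char)pattern[i]] |= (uint64_t)1 << i;

    /* R[d] has bit i set when pattern[0..i] matches a suffix of the
     * text read so far with at most d errors. Before any text is
     * read, the first d pattern characters can be "deleted". */
    uint64_t R[BITAP_MAX_K + 1];
    for (int d = 0; d <= k; d++)
        R[d] = ((uint64_t)1 << d) - 1;

    const uint64_t accept = (uint64_t)1 << (m - 1);

    for (size_t i = 0; text[i] != '\0'; i++) {
        uint64_t cm = mask[(unsigned char)text[i]];
        uint64_t prev_old = R[0];            /* R[d-1] before this char */
        R[0] = ((R[0] << 1) | 1) & cm;       /* exact matching row */

        for (int d = 1; d <= k; d++) {
            uint64_t cur_old = R[d];
            R[d] = (((cur_old << 1) | 1) & cm)  /* match              */
                 | prev_old                     /* extra text char    */
                 | ((prev_old << 1) | 1)        /* substitution       */
                 | ((R[d - 1] << 1) | 1);       /* missing text char  */
            prev_old = cur_old;
        }

        if (R[k] & accept)
            return (long)i;
    }
    return -1;
}
```

With k = 0 this reduces to plain shift-and exact matching. The cost is
roughly O(k * n) word operations for a text of length n, which is why
it stays fast for short patterns and small k. Longer patterns would
need multi-word bit vectors, which this sketch does not attempt.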

Thank you for the quick reply and for the reminder about simmetrics. If
there is no other alternative, I'll try porting it to C and post the link
to the list, so that it's available to anyone else who needs it.

thanks,
Mateusz Berezecki
_______________________________________________
Dailydave mailing list
Dailydavelists.immunitysec.com
http://lists.immunitysec.com/mailman/listinfo/dailydave