Comments for MAWK

09 Sep 2009 00:02 jmellander

As a regular mawk user, I use it for lots of data slicing & dicing. I noticed that when the hash table becomes enormous (millions of entries), performance gets very slow. I surmised that the hash function was producing lots of collisions, so I swapped in a more modern hash function while I was trapped in a slow meeting.

In hash.c I replaced the 'hash' function with:

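/*
FNV-1 hash function,
per http://en.wikipedia.org/wiki/Fowler-Noll-Vo_hash_function
*/
unsigned
hash(s)
register char *s ;
{
register unsigned h = 2166136261U ;	/* 32-bit FNV offset basis */

/* FNV-1 step: multiply by the FNV prime, then XOR in the next byte.
   The cast keeps bytes above 0x7F from being sign-extended. */
while (*s) h = (h * 16777619U) ^ (unsigned char) *s++ ;
return h ;
}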

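and in array.c I replaced 'ahash' with:

/*
FNV-1 hash function,
per http://en.wikipedia.org/wiki/Fowler-Noll-Vo_hash_function
*/
static unsigned ahash(sval)
STRING* sval ;
{
register unsigned h = 2166136261U ;	/* 32-bit FNV offset basis */
register char *s = sval->str;

/* FNV-1 step: multiply by the FNV prime, then XOR in the next byte.
   The cast keeps bytes above 0x7F from being sign-extended. */
while (*s) h = (h * 16777619U) ^ (unsigned char) *s++ ;
return h ;
}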
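
For anyone who wants to try the function outside mawk first, here is a minimal standalone sketch in ANSI C. The names fnv1_32, FNV_OFFSET_BASIS, FNV_PRIME and fnv1_32_selfcheck are illustrative and do not come from the mawk sources. Using uint32_t and the cast to unsigned char are assumptions made here so the result does not depend on the width of unsigned or the signedness of char.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names, not taken from the mawk sources. */
#define FNV_OFFSET_BASIS 2166136261u   /* 32-bit FNV offset basis */
#define FNV_PRIME        16777619u     /* 32-bit FNV prime */

/* FNV-1: multiply by the prime, then XOR in the next byte. */
static uint32_t fnv1_32(const char *s)
{
    uint32_t h = FNV_OFFSET_BASIS;

    while (*s) {
        h *= FNV_PRIME;
        h ^= (unsigned char) *s++;   /* avoid sign extension of bytes >= 0x80 */
    }
    return h;
}

/* Checks two known 32-bit FNV-1 values. */
static void fnv1_32_selfcheck(void)
{
    assert(fnv1_32("") == 0x811c9dc5u);   /* empty string yields the offset basis */
    assert(fnv1_32("a") == 0x050c5d7eu);
}
```

Calling fnv1_32_selfcheck() once at startup is a cheap way to catch a mistyped constant before patching hash.c and array.c.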

Will send benchmark results later, when I run it on an unloaded system.
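
A rough way to check the collision theory without timing mawk itself is to count how many buckets a hash function actually touches for a batch of keys. The sketch below is only an illustration under stated assumptions: the table size, the synthetic keys "key0" through "key99999", and the byte-sum baseline are all hypothetical. In particular, byte_sum_hash is a deliberately weak stand-in and is not mawk's original hash.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NBUCKETS 65536u    /* hypothetical table size, not mawk's */
#define NKEYS    100000u   /* synthetic keys "key0" .. "key99999" */

typedef uint32_t (*hash_fn)(const char *);

/* Deliberately weak baseline: sum of the bytes. NOT mawk's original hash. */
static uint32_t byte_sum_hash(const char *s)
{
    uint32_t h = 0;

    while (*s) h += (unsigned char) *s++;
    return h;
}

/* 32-bit FNV-1: multiply by the prime, then XOR in the next byte. */
static uint32_t fnv1_32(const char *s)
{
    uint32_t h = 2166136261u;

    while (*s) {
        h *= 16777619u;
        h ^= (unsigned char) *s++;
    }
    return h;
}

/* Returns how many of the NBUCKETS buckets receive at least one key,
   or 0 if the scratch array cannot be allocated. */
static unsigned used_buckets(hash_fn f)
{
    unsigned char *seen = calloc(NBUCKETS, 1);
    char key[32];
    unsigned i, used = 0;

    if (seen == NULL) return 0;
    for (i = 0; i < NKEYS; i++) {
        uint32_t b;

        sprintf(key, "key%u", i);
        b = f(key) % NBUCKETS;
        if (!seen[b]) { seen[b] = 1; used++; }
    }
    free(seen);
    return used;
}
```

Printing used_buckets(byte_sum_hash) next to used_buckets(fnv1_32) shows how evenly each function spreads the same keys. The fewer buckets touched, the longer the chains, which is the kind of slowdown described above. Real data will behave differently from these synthetic keys, so this complements a benchmark rather than replacing it.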
