I think we're overlooking a possibility that would combine the effectiveness of rule-based and Bayes filtering. Words and phrases need not be the only tokens Bayes scores. Any property of a message, i.e. the result of evaluating any rule, can be a Bayes token. POPFile implements this to some degree with "pseudowords" (http://popfile.sourceforge.net/cgi-bin/wiki.pl?Glossary/PseudoWord), but I would really like to see what would happen if a large ruleset like SpamAssassin's were fully tokenized. Forget about manual scoring -- why not let Bayes figure it out?
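
To make the idea concrete, here's a minimal sketch in Python. The rule names and messages are invented, and the three toy rules stand in for a real ruleset like SpamAssassin's; the point is just that each rule that fires emits a pseudo-token into the same stream as the word tokens, and a plain naive Bayes learner weights the rules and the words together:

```python
import math
import re
from collections import defaultdict

# Toy stand-ins for SpamAssassin-style rules (names are hypothetical).
# Each maps a pseudo-token name to a predicate over the message.
RULES = {
    "RULE:ALL_CAPS_SUBJECT": lambda msg: msg.get("subject", "").isupper(),
    "RULE:HAS_UNSUBSCRIBE": lambda msg: "unsubscribe" in msg.get("body", "").lower(),
    "RULE:MANY_EXCLAMATIONS": lambda msg: msg.get("body", "").count("!") >= 3,
}

def tokenize(msg):
    """Yield ordinary word tokens plus one pseudo-token per rule that fires."""
    tokens = re.findall(r"[a-z0-9']+", msg.get("body", "").lower())
    tokens += [name for name, rule in RULES.items() if rule(msg)]
    return tokens

class NaiveBayes:
    """Multinomial naive Bayes over the combined word + rule token stream."""

    def __init__(self):
        self.counts = {"spam": defaultdict(int), "ham": defaultdict(int)}
        self.totals = {"spam": 0, "ham": 0}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, msg, label):
        self.docs[label] += 1
        for tok in tokenize(msg):
            self.counts[label][tok] += 1
            self.totals[label] += 1

    def score(self, msg):
        """Log-odds of spam vs. ham; positive means more likely spam."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        n_docs = self.docs["spam"] + self.docs["ham"]
        logodds = (math.log(self.docs["spam"] / n_docs)
                   - math.log(self.docs["ham"] / n_docs))
        for tok in tokenize(msg):
            # Laplace smoothing so unseen tokens don't zero out a class.
            p_spam = (self.counts["spam"][tok] + 1) / (self.totals["spam"] + len(vocab))
            p_ham = (self.counts["ham"][tok] + 1) / (self.totals["ham"] + len(vocab))
            logodds += math.log(p_spam) - math.log(p_ham)
        return logodds
```

The rule pseudo-tokens get no manual scores at all: if `RULE:HAS_UNSUBSCRIBE` turns out to be a strong spam indicator in the training corpus, its learned token probabilities will say so, exactly as for any word.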