
Apache SpamAssassin

Apache SpamAssassin is an extensible email filter that is used to identify spam. Once identified, the mail can then be optionally tagged as spam for later filtering. It provides a command line tool to perform filtering, a client-server system to filter large volumes of mail, and Mail::SpamAssassin, a set of Perl modules allowing Apache SpamAssassin to be used in a wide variety of email systems.

Recent releases

  •  19 Mar 2010 16:34

    Release Notes: Spamhaus DBL was added as a URIBL_DBL_SPAM rule. The ImageInfo plugin was updated to the latest release. The RCVD_IN_CSS rule was fixed. 2tld and 3tld sub-domain hosters were listed for URIBL/SURBL/DBL queries. Other fixes were made.

  •  26 Jan 2010 13:24

    Release Notes: Rules were updated and new scores are assigned by GA. Rules are no longer in the package, but installed by sa-update. A new function can() allows testing for capabilities offered. Support for IPv6 was greatly improved. When the time limit is exceeded, partial results are still returned. The timing report is now logged or offered as a tag. A caller may supply out-of-band data. Detection of URI was rewritten. The DKIM plugin now supports multiple signatures and ADSP with overrides. The FreeMail, PhishTag, and Reuse plugins were added. Error detection, handling, and reporting were improved to facilitate troubleshooting.

  •  12 Jun 2008 14:03

    Release Notes: Newer gpg versions require keys to be cross-certified, so the sa-update public key was fixed accordingly. A perl version string was added to the storage area for compiled rulesets, to avoid crashes when perl is upgraded between major versions (e.g. perl 5.8.x to 5.10.0) and the ABI breaks. Some FORGED_MUA_OUTLOOK false positives were cleared on the new-format Message-ID generated by the Outlook Express version used in Windows XP service pack 3. Compatibility with Postgres 8.1.0 and later was fixed. Other miscellaneous fixes were done.

  •  07 Jan 2008 19:59

    Release Notes: Major sa-compile fixes. Minor fixes in other departments. 'score set for a non-existent rule' has been made a debug message, instead of a lint warning, since it's a very frequent FAQ.

  •  09 Aug 2007 20:48

    Release Notes: The new setuid code has been fixed to work with Perl 5.6.1 and to support DCC and Pyzor in all releases of Perl. The default 'user_scores_ldap_username' is now the null string, allowing anonymous binding. A 'schema' syntax error in LDAP config support has been fixed, along with an error where zeroing an 'eval' rule's score did not stop it from running. The new message ID format seen from Vista or Windows 2003 Server MAPI is now allowed to avoid false positives, and several issues with RDNS_DYNAMIC have been fixed.

            Recent comments

            21 Jan 2004 03:07 crippler

            some of this discussion is outdated
            SpamAssassin has come a long way since this discussion started. The concept of whitelisting & blacklisting messages has gotten a whole lot easier.

            Now that SA has Bayesian filters, training your SA with a large corpus of mail can be pretty easy (though necessarily tedious). I've set up two folders; one called "Ham" and the other called "Spam". Go through and move messages in your mailbox to one of those two folders. Good mail goes to Ham, junkmail goes to Spam. The larger the corpus of mail you pull from the better.

            Next I set up a couple of cron jobs that look like:

            20 2 * * * /usr/bin/sa-learn --ham --mbox ~/Ham
            20 4 * * * /usr/bin/sa-learn --spam --mbox ~/Spam

            Once a day, SpamAssassin goes through my Ham & Spam folders and learns what good & bad mail tend to look like. The more I feed it, the better it gets at catching spam.
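The same nightly training could also be driven from a small script rather than two crontab lines. The following is only a sketch: the `sa-learn` path and the `--ham`, `--spam`, and `--mbox` options come from the cron jobs above, while the function names and the "skip a missing or empty mbox" check are assumptions added for illustration.

```python
import os
import subprocess

SA_LEARN = "/usr/bin/sa-learn"  # path taken from the crontab lines above


def sa_learn_commands(home: str) -> list[list[str]]:
    """Build the two sa-learn invocations: one for ham, one for spam."""
    return [
        [SA_LEARN, "--ham", "--mbox", f"{home}/Ham"],
        [SA_LEARN, "--spam", "--mbox", f"{home}/Spam"],
    ]


def train(home: str) -> None:
    """Run sa-learn on each mbox that exists and is non-empty (hypothetical helper)."""
    for argv in sa_learn_commands(home):
        mbox = argv[-1]
        if os.path.isfile(mbox) and os.path.getsize(mbox) > 0:
            subprocess.run(argv, check=True)
```

Calling `train(os.path.expanduser("~"))` from a single cron entry would then cover both folders.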

            Some types of spam were still getting through despite this filtering. The scores were significant but below the minimum score I had set to mark a message as spam. Many of these were Nigerian scam mails. Here are some lines I added to my global SpamAssassin config to take a big chunk out of incoming Spam:

            # New blacklist not included in
            # default configuration
            header RCVD_IN_BNBL eval:check_rbl('bl', 'bl.blueshore.net.')
            describe RCVD_IN_BNBL Listed by BNBL
            tflags RCVD_IN_BNBL net
            score RCVD_IN_BNBL 2
            # Higher scoring for Nigerian scams
            score NIGERIAN_BODY1 3
            score NIGERIAN_BODY2 2
            # Known high-volume spammers that
            # I have no interest in hearing from.
            body PHARMAWHAREHOUSE /pharmawharehouse\.biz/
            describe PHARMAWHAREHOUSE Link to pharmawharehouse.biz
            body PHARMACOURT /pharmacourt\.biz/
            describe PHARMACOURT Link to pharmacourt.biz
            body VALUEPOINTMEDS /valuepointmeds\.biz/
            describe VALUEPOINTMEDS Link to valuepointmeds.biz
            score PHARMAWHAREHOUSE 10
            score PHARMACOURT 10
            score VALUEPOINTMEDS 10
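To see why raising individual scores helps, it may be useful to picture the decision as a sum of the scores of all rules that hit, compared against a threshold. The sketch below is a simplified model, not SpamAssassin's actual code. The threshold of 5.0 is SpamAssassin's default `required_score` (the commenter's own setting is not stated), the "before" scores are made-up placeholders, and only the "after" scores of 3 and 2 come from the config above.

```python
REQUIRED_SCORE = 5.0  # SpamAssassin's default; the commenter's own value is not given


def total_score(hits: list[str], scores: dict[str, float]) -> float:
    """Add up the scores of every rule that matched the message."""
    return sum(scores.get(rule, 0.0) for rule in hits)


def is_spam(hits: list[str], scores: dict[str, float],
            required: float = REQUIRED_SCORE) -> bool:
    """A message is tagged once its total reaches the required score."""
    return total_score(hits, scores) >= required


# Rules that a typical Nigerian scam mail might trigger.
hits = ["NIGERIAN_BODY1", "NIGERIAN_BODY2"]

# Placeholder "before" values, chosen only to illustrate a near miss.
before = {"NIGERIAN_BODY1": 1.5, "NIGERIAN_BODY2": 1.0}
# "After" values as set in the config above.
after = {"NIGERIAN_BODY1": 3.0, "NIGERIAN_BODY2": 2.0}
```

With the placeholder scores the message totals 2.5 and slips through; with the raised scores it totals 5.0 and is tagged.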

            08 May 2003 03:22 weissel

            Re: Almost Amazing!

            > > Won't work well --- if at all --- for
            > >
            > > * Mailing lists
            > > * automated mailings (freshmeat's new version mailings, most
            > > buying over the internet stuff, Bounces, etc.)
            %
            > Legitimate mailing lists and automated mailings are usually
            > easy to differentiate from spam;

            I got a 'please go to this website' message (where you have to enter a
            20-character string to let the message pass through) ... which
            looked so much like the spam I usually get that the spam filter
            treated it as spam.

            In the end I had to re-write the message, before it passed
            through, as it timed out the first time before I looked through
            the spam heap. I would not have done this if the email had not
            been important _for me_ to arrive. Helping others is _not_ that
            important, as I do this on my free time.

            If I had countered with a confirmation request instead of
            throwing it on the spam heap, I'd never have known that my mail
            never made it. Instead I would have grumbled over the recipient's
            silence.

            Easy to differentiate, indeed.


            > also, if you know ahead of time that you are subscribing to
            > something, you can add it to a whitelist.

            So I just got a mail from a guy 'noreply@freshmeat.net' which
            notified me of your answer. Never got that mail before. So how
            can I whitelist that in advance? How is noreply@freshmeat.net
            gonna read, much less respond to a confirmation request?

            How is _that_ low maintenance?

            (The same goes, as I said, for many online shopping cases.)


            > > * people who don't like jumping through hoops to get mail
            > > through (unfortunately these are usually the people who
            > > give answers).
            %
            > First, you can safely whitelist everybody you send to, so as
            > not to inconvenience them.

            i.e. even more work for me to integrate that into my mail client.

            And if they answer me from a different (e.g. preferred or new)
            address, they'll be inconvenienced again --- when all they are
            trying to do is let me reach them better/faster.

            This can be real fun if you use sneakemail.com (I do).
            If you send me a mail to my sneakemail address (say
            xxxxx@sneakemail.com), I get a temporary yyyyy@sneakemail.com
            (which will expire in a few days).

            You send me another mail in a week ... and I'll get a
            yzyzyz@sneakemail.com. A new confirmation is clearly necessary,
            right? So you'll have to parse the X-Sneakemail-From: header
            instead of just the From header, where it applies.
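A whitelist lookup that copes with this would have to prefer the X-Sneakemail-From: header over From whenever it is present. The sketch below uses Python's standard `email` package. It assumes that X-Sneakemail-From: carries an ordinary "Name <address>" value, which the comment does not spell out, and the function name is made up for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def effective_sender(raw_message: str) -> str:
    """Return the address to check against a whitelist.

    Prefers X-Sneakemail-From: (assumed to hold a normal address)
    and falls back to From: when that header is absent.
    """
    msg = message_from_string(raw_message)
    header = msg.get("X-Sneakemail-From") or msg.get("From", "")
    return parseaddr(header)[1].lower()


forwarded = (
    "From: yyyyy@sneakemail.com\n"
    "X-Sneakemail-From: Real Person <real@example.org>\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
direct = "From: Someone <A@Example.com>\nSubject: hi\n\nbody\n"
```

Here `effective_sender(forwarded)` yields the stable address rather than the temporary one, so a second mail a week later would still match the same whitelist entry.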


            > Also, if you apply this, say, only to messages tagged by
            > spamassassin as 'probable spam', only your friends trying to
            > sell you penis enlargements will be asked to confirm :-)

            So we are still stuck on the case --- which I, personally,
            experienced --- where a confirm mail will be asked to confirm
            itself. At best, you'll never ever see that mail. Really a good
            thing if the mail was somewhat important.


            > > * Senders where the anti-spam system fires such a message
            > > right back to you --- you can get a nice mail flood if that
            > > goes over a mailing list. For 3 parties you'll get a very
            > > very impressive snowball effect! (Can you say 'complete
            > > meltdown'?)
            %
            > Oh, come on now. Sending one message per address is a simple
            > thing to do.

            You are implying a world where nobody's 'out of office' mails
            will be sent in answer to their own 'out of office' mails.

            Welcome to reality.

            I have seen that at 100 mails/hour on a mailing list. More than
            once. So much that the mailing list finally stopped Reply-To
            munging. It won't help, either, if the sender address keeps
            changing. Like some people who regularly change their mail
            addresses to avoid spam.
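The "one message per address" idea and its weak spot can both be seen in a small guard function. This is only a sketch of the usual precautions, not the behaviour of any particular autoresponder: it skips mail marked as automatic (the Auto-Submitted header from RFC 3834, plus the conventional Precedence values) and answers each sender address once. The function name and the in-memory set are assumptions for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def should_auto_reply(raw_message: str, already_answered: set[str]) -> bool:
    """Decide whether an automatic reply may be sent for this message."""
    msg = message_from_string(raw_message)

    # RFC 3834: do not respond to messages that are themselves automatic.
    auto = (msg.get("Auto-Submitted") or "no").split(";")[0].strip().lower()
    if auto != "no":
        return False

    # Conventional markers for list traffic and bulk mail.
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in {"bulk", "list", "junk"}:
        return False

    # At most one automatic reply per sender address.
    sender = parseaddr(msg.get("From", ""))[1].lower()
    if not sender or sender in already_answered:
        return False
    already_answered.add(sender)
    return True
```

The per-address check stops a loop only while the other side keeps the same address and does not mark its mail as automatic. A sender whose address changes with every message passes the check each time, which is the case described above.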


            > To see two systems that are successful with the confirmation
            > technique, read up on these: TMDA and Active Spam Killer.
            > Remember that you can combine this with a spam identifier like
            > spamassassin to only request confirmation from messages that
            > look like spam.

            So you'll be part of a DDoS on some poor schmuck whose address
            was faked into the mail.

            If but 0.5% of the recipients of a modest spam run of 5 million
            use such a thing, you'll have 25k mails on you on the day your
            address appears in the From of a spam. And often enough it is
            somebody's spam. Ask the owners of test.com. With luck, you'll
            fire off another 25k mails if the confirmation request includes
            the original spam "for your convenience".

            And now imagine 1% and 20 million recipients. 200k mails is fun
            and a half.
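The figures above follow from a single multiplication, sketched here for anyone who wants to try other assumptions. The function name is made up, and the model assumes every recipient who uses such a system sends exactly one confirmation request to the forged address.

```python
def confirmation_requests(recipients: int, share_using_challenges: float) -> int:
    """Confirmation mails aimed at the forged From address of one spam run."""
    return round(recipients * share_using_challenges)


modest_run = confirmation_requests(5_000_000, 0.005)   # 0.5% of 5 million
larger_run = confirmation_requests(20_000_000, 0.01)   # 1% of 20 million
```

These evaluate to 25,000 and 200,000 mails respectively, matching the 25k and 200k quoted above.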

            Again, it's your choice. I believe that these things can
            harm others, badly, and thus should not be used without deep
            understanding. But go right ahead; time will show if DDoSsing
            innocent bystanders will help the fight against spam.

            07 May 2003 12:15 markthomas

            Re: Almost Amazing!

            > % How about this: set up an autoresponder
            > % that says, "I'm sorry, your message has
            > % been trapped by my spam filter. If this
            > % is a legitimate email message, please
            > % put the word PASSWORD in the subject.
            > % [...]
            > %
            > % I guarantee that spammers are not going
            > % to bother putting your password in the
            > % subject.
            >
            >
            > Won't work well --- if at all --- for
            > * Mailing lists
            > * automated mailings (freshmeat's new
            > version mailings, most buying over the
            > internet stuff, Bounces, etc.)


            Legitimate mailing lists and automated mailings are usually easy to differentiate from spam; also, if you know ahead of time that you are subscribing to something, you can add it to a whitelist.


            > * people who don't like jumping through
            > hoops to get mail through (unfortunately
            > these are usually the people who give
            > answers).


            First, you can safely whitelist everybody you send to, so as not to inconvenience them.
            Also, if you apply this, say, only to messages tagged by spamassassin as 'probable spam', only your friends trying to sell you penis enlargements will be asked to confirm :-)


            > * Senders where the anti-spam system
            > fires such a message right back to you
            > --- you can get a nice mail flood if
            > that goes over a mailing list. For 3
            > parties you'll get a very very
            > impressive snowball effect! (Can you
            > say 'complete meltdown'?)


            Oh, come on now. Sending one message per address is a simple thing to do.


            > * If a mailing list rewrites the header
            > enough (reply-to munging comes to mind)
            > you could even start answering your own
            > "put PASSWORD in subject line" for all
            > the mailing list to see. Fun (and that
            > has happened with vacation mails before,
            > at 100 mails/h)!
            >
            > You will have to decide yourself if
            > these restrictions and dangers are
            > acceptable to you, your mailing list
            > reputation and your environment; you
            > also have to think about how to avoid
            > vicious circles as outlined above.
            > Dropping mails you'll always risk
            > dropping information, if that risk is
            > acceptable to you, go ahead.


            To see two systems that are successful with the confirmation technique, read up on these: TMDA (http://www.tmda.net/) and Active Spam Killer (http://sourceforge.net/projects/a-s-k). Remember that you can combine this with a spam identifier like spamassassin to only request confirmation from messages that look like spam.
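The combination suggested here (whitelist first, then request confirmation only for mail the filter has already tagged) could look roughly like the sketch below. It is not how TMDA or Active Spam Killer work internally. It assumes the message has already passed through SpamAssassin, which adds an `X-Spam-Flag: YES` header to mail it considers spam, and the function name and return values are made up for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def triage(raw_message: str, whitelist: set[str]) -> str:
    """Return "deliver" or "challenge" for an already-scanned message."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()

    # Known correspondents are never asked to confirm.
    if sender in whitelist:
        return "deliver"

    # Only mail that SpamAssassin tagged as spam triggers a confirmation request.
    flagged = (msg.get("X-Spam-Flag") or "").strip().upper() == "YES"
    return "challenge" if flagged else "deliver"
```

Mail from unknown senders that the filter did not tag is delivered without any challenge, so only the tagged remainder generates confirmation traffic.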

            20 Mar 2003 06:49 weissel

            Re: Almost Amazing!

            %
            > % % How about this: set up an autoresponder
            > % % that says, "I'm sorry, your message has
            > % % been trapped by my spam filter. If this
            > % % is a legitimate email message, please
            > % % put the word PASSWORD in the subject.
            > % % [...]
            %
            > % % I guarantee that spammers are not going
            > % % to bother putting your password in the
            > % % subject.
            %
            %
            [shortened]
            > % Won't work well --- if at all --- for
            > % * Mailing lists
            > % * automated mailings
            > % * people who don't like jumping through
            > % hoops
            > % * Senders where the anti-spam system
            > % fires such a message right back to you
            > % * If a mailing list rewrites the header
            > % enough
            [leading to endless mail loops and other fun things]
            %
            > % You will have to decide yourself if
            > % these restrictions and dangers are
            > % acceptable to you, your mailing list
            > % reputation and your environment;
            [...]
            %
            > Most of what you are asking for can be resolved
            > using the user_prefs file. You can find
            > a free Windows utility for creating and
            > editing user_prefs files here:
            %
            > http://www.CleanMyMailbox.com/sa


            As a non-Windows user I cannot use that program (not that I'd need it).

            Also, there is no way the user_prefs file can prevent the problems outlined above if you use an autoresponder telling people to put something specific into the subject.

            19 Mar 2003 22:21 jhalbrook

            Re: Almost Amazing!

            >
            > % How about this: set up an autoresponder
            > % that says, "I'm sorry, your message has
            > % been trapped by my spam filter. If this
            > % is a legitimate email message, please
            > % put the word PASSWORD in the subject.
            > % [...]
            > %
            > % I guarantee that spammers are not going
            > % to bother putting your password in the
            > % subject.
            >
            >
            > Won't work well --- if at all --- for
            > * Mailing lists
            > * automated mailings (freshmeat's new
            > version mailings, most buying over the
            > internet stuff, Bounces, etc.)
            > * people who don't like jumping through
            > hoops to get mail through (unfortunately
            > these are usually the people who give
            > answers).
            > * Senders where the anti-spam system
            > fires such a message right back to you
            > --- you can get a nice mail flood if
            > that goes over a mailing list. For 3
            > parties you'll get a very very
            > impressive snowball effect! (Can you
            > say 'complete meltdown'?)
            > * If a mailing list rewrites the header
            > enough (reply-to munging comes to mind)
            > you could even start answering your own
            > "put PASSWORD in subject line" for all
            > the mailing list to see. Fun (and that
            > has happened with vacation mails before,
            > at 100 mails/h)!
            >
            > You will have to decide yourself if
            > these restrictions and dangers are
            > acceptable to you, your mailing list
            > reputation and your environment; you
            > also have to think about how to avoid
            > vicious circles as outlined above.
            > Dropping mails you'll always risk
            > dropping information, if that risk is
            > acceptable to you, go ahead.


            Most of what you are asking for can be resolved using the user_prefs file. You can find a free Windows utility for creating and editing user_prefs files here:

            http://www.CleanMyMailbox.com/sa
