-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
David Morton wrote:
> Craig Thompson wrote:
>> I have designated an account as a spam-trap account, but I can't find
>> any documentation on setting it up. My goal in using a spam-trap
>> account is to turn off the non-spam cache.
>
>> I am forwarding spam emails to the spam-trap account that Maia allowed
>> through. The system is still trying to forward the email to an account,
>> so I am getting bounces. I was under the impression that Maia would not
>> forward these emails. When I look at statistics for the account, all
>> emails have been confirmed as spam, which is good.
>
>
> That's not how spamtraps work; a spamtrap address is one that receives only spam
> directly from the spammers. You cannot forward messages to it, otherwise it will
> think you are the spammer.

Indeed, I think you misunderstand the purpose of a spam-trap. A
spam-trap is an account that receives no legitimate mail--it must use an
email address that has never been used for any legitimate purpose, nor
ever been advertised in any way that would invite people to send
legitimate mail to it. It may not even have been advertised at all;
thanks to dictionary attacks, spammers doing address probes will
discover addresses like "asdfghjk@example.com" eventually without any
help, as long as that address appears to accept mail.

The key feature of a spam-trap, though, is that by its careful design,
everything it receives is by definition "unsolicited," so there's no
need to do any analysis of the mail--not for viruses, not for spam, not
for banned attachments or bad headers. Everything it receives is
automatically classified as "confirmed spam," so no additional resources
are required to deal with it.
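
To make that short-circuit concrete, here is a minimal sketch in Python,
chosen purely for illustration. It is not Maia's code, and the names
SPAMTRAP_ADDRESSES, classify and run_full_analysis are hypothetical:

```python
# Hypothetical list of trap addresses; a real system would load this
# from its own configuration.
SPAMTRAP_ADDRESSES = {"asdfghjk@example.com"}


def classify(recipient, run_full_analysis):
    """Return a verdict for one incoming message.

    run_full_analysis is a hypothetical callable standing in for the
    expensive virus/spam/attachment/header checks.
    """
    if recipient.lower() in SPAMTRAP_ADDRESSES:
        # Mail to a trap is unsolicited by definition, so skip every scan.
        return "confirmed spam"
    # Everything else still goes through the normal checks.
    return run_full_analysis()
```

The point of the sketch is only that the expensive checks are never
called for a trap address.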

Spam-traps are useful mainly as a means of gathering spam samples for
Bayes training and reporting to collaborative networks like Razor,
Pyzor, DCC, and SpamCop. Spam samples gathered this way are generally
considered to be "more provably" spam, given the design of the spam-trap
address mechanism, so such evidence is more likely to be accepted with
confidence by DNSBLs and other authorities.

As David points out, spam-traps are /not/ "spam-reporting" addresses,
and should never be used that way. The main problem is that mail
forwarding/redirecting modifies the mail headers in the process, so the
evidence that gets submitted for Bayes training and spam reporting is
tainted with information from the forwarder. Eventually your Bayes
database starts recognizing the forwarders' mail as spam, and if you've
been reporting this stuff to Razor/Pyzor/DCC/SpamCop, others around the
world will eventually start doing the same.

Using a "spam-reporting" address for reporting false negatives is a
tricky business. To do it properly, you need to ensure that the
original email is not modified in any way. Encapsulating it as an
attachment is one way to do this, but it of course requires that a
process at the receiving end know to unpack the attachment. This also
requires that all of your submitters know to do the encapsulating
properly, and that they never forget to do so.
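
For anyone who does want to try the attachment route, a rough sketch of
both ends using only Python's standard email package might look like
this. The function names and the report subject are my own inventions,
and nothing here is part of Maia:

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def wrap_as_attachment(raw_spam, sender, report_addr):
    """Build a report that carries the original as a message/rfc822 part."""
    original = message_from_bytes(raw_spam, policy=policy.default)
    report = EmailMessage()
    report["From"] = sender
    report["To"] = report_addr
    report["Subject"] = "spam report"  # hypothetical subject line
    report.set_content("Original message attached.")
    # Attaching a message object yields a message/rfc822 part, so the
    # spam keeps its own headers, separate from the reporter's.
    report.add_attachment(original)
    return report.as_bytes()


def unpack_attachments(raw_report):
    """Return the encapsulated messages found in a report."""
    report = message_from_bytes(raw_report, policy=policy.default)
    found = []
    for part in report.iter_attachments():
        if part.get_content_type() == "message/rfc822":
            found.append(part.get_content())  # the inner message object
    return found
```

The receiving end would then hand each unpacked message, rather than the
report that carried it, to whatever does the learning and reporting.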

In fact, it was this very problem with "spam-reporting" addresses that
motivated us to devise the non-spam cache mechanism for Maia for
reporting false negatives (i.e. the mechanism you're so eager to get rid
of). Since Maia stores a pristine copy of the email in its database,
there's no chance that the headers will get munged during the
learning/reporting phase. It's also simpler for end-users to use than
having to consistently encapsulate their spam as attachments (and of
course for you to write an attachment unpacker at the receiving end).
Admittedly, it's a new concept to get through to users, many of whom are
initially a bit confused about what the non-spam cache is for, but in
the end I think it's a much safer and more reliable way to report false
negatives.

There's another reason to use the non-spam cache, though: Bayes
training. Even if you perfected a "spam-reporting" address mechanism,
you'd still be doing yourself a disservice by denying your Bayes
database an opportunity to learn from user-confirmed non-spam. The
Bayes engine works best (and fastest) when it gets feedback from users
to tell it not only when it has made mistakes, but also when it has
guessed correctly--that's what the "confirmation" process is all about.
When you hit the "confirm" button in the spam quarantine or the
non-spam cache, you're not just rescuing false positives and reporting
false negatives, you're also telling SpamAssassin that it got everything
else /right/. That increases SpamAssassin's confidence in its judgment
about the tokens in those items, so it learns faster and more
effectively. If you disable the non-spam cache, then, it will never get
that kind of confirmation about non-spam items, so most of what it
learns will be about spam, and its ability to discriminate one from the
other will suffer.
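
A toy illustration of why one-sided training hurts, in Python. This is a
deliberately simplified per-token estimate, not SpamAssassin's actual
Bayes implementation, and the class and method names are made up for the
example:

```python
from collections import Counter


class TokenStats:
    """Toy token counts; NOT SpamAssassin's real Bayes algorithm."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, text, is_spam):
        # Count each distinct token once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.n_spam += 1
        else:
            self.ham_tokens.update(tokens)
            self.n_ham += 1

    def spam_probability(self, token):
        """Rough P(spam | token); 0.5 when the token is unknown."""
        token = token.lower()
        spam_rate = self.spam_tokens[token] / self.n_spam if self.n_spam else 0.0
        ham_rate = self.ham_tokens[token] / self.n_ham if self.n_ham else 0.0
        if spam_rate + ham_rate == 0:
            return 0.5
        return spam_rate / (spam_rate + ham_rate)


# Trained on spam only: an ordinary word looks as spammy as "pills".
spam_only = TokenStats()
spam_only.learn("meeting pills offer", is_spam=True)

# Trained on both: confirmed non-spam pulls "meeting" back to neutral.
both = TokenStats()
both.learn("meeting pills offer", is_spam=True)
both.learn("meeting agenda tomorrow", is_spam=False)
both.learn("meeting notes", is_spam=False)
```

With only spam to learn from, spam_only.spam_probability("meeting")
comes out at 1.0; once the non-spam is confirmed as well,
both.spam_probability("meeting") drops to 0.5 while "pills" stays at 1.0.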
- --
Robert LeBlanc <rjl renaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFGH/RlGmqOER2NHewRArxdAJ0cJHdbsE2LaUndoNmEobKd13z4PQCcDqvc
XhKBe5RxVxui5ECFSLMfTnI=
=BPui
-----END PGP SIGNATURE-----
_______________________________________________
Maia-users mailing list
Maia-users renaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users