Thread: PDF spam solutions
|
|
| PDF spam solutions |
  Canada |
2007-08-13 16:41:19 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Like the rest of you, I'm sure, I've been receiving a glut of PDF spam
lately, and I've been experimenting with various tactics for curbing the
onslaught. Some tactics work better than others, naturally, so I
thought I'd share my results here.
(1) SpamAssassin core rules
To deal with PDF spam, the SpamAssassin developers added a new core rule
called TVD_PDF_FINGER01, which identifies emails that have empty bodies
but contain PDF attachments. It works well, but its default score of
1.0 is too low to make it the only tool for the job. Increasing the
score isn't really a good idea, though, since a lot of business users
regularly send PDF attachments with empty mail bodies, and this could
lead to false positives in a hurry.

You can certainly get this new rule for any version of SpamAssassin
(newer than 3.1.1) using sa-update, but now that the 3.2.x series
appears to have stabilized I'd also recommend that you upgrade to 3.2.3
to take advantage of the latest rulesets.
(2) PDFInfo plugin
Available from <http://www.rulesemporium.com/plugins.htm>, this plugin
is a step better in that it tries to identify specific PDF spams by
their characteristics--image dimensions, number of images in the file,
image-to-text ratio, filename, and meta-information (e.g. author,
creator, creation/modified date, etc.), as well as fuzzy hashes of the
file itself.

The downside is that it's /too/ specific, and that requires you to
download new versions of the pdfinfo.cf file whenever new signatures are
added, because every new signature is a new rule. This makes the plugin
very nice for catching PDF spam that's already circulating, but it's not
effective at catching new variants, and updating it is awkward.
(3) PDFText plugin
The PDFText plugin uses the pdftotext and pdfinfo utilities from the xpdf
package to try to extract the text and meta-information from PDF files,
so that they can then be subjected to pattern-based tests for spammy
content. Two versions are currently available:

For SpamAssassin 3.1.x:
<http://www.mail-archive.com/users@spamassassin.apache.org/msg45465.html>

For SpamAssassin 3.2.x:
<http://www.mail-archive.com/users@spamassassin.apache.org/msg45494.html>

Unfortunately this plugin is still a very early alpha--proof-of-concept,
really--and needs a considerable amount of polishing before it could
be recommended for production use. It also relies on its own
wordlist for scoring, rather than making the discovered text available
to the full battery of SpamAssassin rules, but the author is apparently
working on that, along with experimental support for using GOCR to scan
the images in PDF files.
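
As a rough illustration of the extract-then-match approach described
above (this is not the PDFText plugin's own code), the Python sketch
below shells out to xpdf's pdftotext and scores whatever text comes back
against a small wordlist. The patterns, weights, and function names are
made up for the example.

```python
import re
import subprocess

# Hypothetical wordlist with per-pattern weights; not the plugin's real list.
SPAM_PATTERNS = {
    r"\bstock\s+alert\b": 2.0,
    r"\bstrong\s+buy\b": 1.5,
    r"\bprice\s+target\b": 1.0,
}


def extract_pdf_text(pdf_path, timeout=10):
    """Return the text layer of a PDF using xpdf's pdftotext.

    Passing "-" as the output file makes pdftotext write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def score_text(text, patterns=SPAM_PATTERNS):
    """Sum the weights of every pattern that occurs at least once in text."""
    total = 0.0
    for pattern, weight in patterns.items():
        if re.search(pattern, text, flags=re.IGNORECASE):
            total += weight
    return total
```

Something like score_text(extract_pdf_text("attachment.pdf")) would tie
the two together. A PDF that contains only images has no text layer to
extract, so it would score 0 here, which is presumably why OCR support
is being considered.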
(4) FuzzyOCR plugin
There's been some discussion about FuzzyOCR's potential role in catching
PDF spam--at least the PDF spam that incorporates images. The plugin's
author is reluctant at best: "actually, I will not try to scan PDFs, the
risk of false positives is too high and PDFs do not have a future for
spammers (in my opinion) as most clients do not display them directly.
Sending PDFs is only a desperate try of spammers to circumvent image
scanners, but I don't think this will be the new "trend", neither do I
think that this kind of spam has any future or success, like image spam
has."

That said, he seems to have relented under the pressure, and some basic
support for this was added recently to the svn version with a lot of
disclaimers ("highly experimental and disabled by default", "Enable this
at your own risk, this might lead to false positives and classify
important documents as spam. YOU HAVE BEEN WARNED.").

Since you need to be using the svn version of FuzzyOCR if you're running
SpamAssassin 3.2.x anyway, you may wish to experiment with the
PDF-scanning support, since it won't cost you any resources you aren't
already spending. If you're /not/ using FuzzyOCR, though, I wouldn't
advise installing it just to solve the PDF spam problem.
(5) Custom rules
Eric A. Hall posted a custom ruleset recently to the SpamAssassin-Users
list that uses the AWL to determine whether the sender of a binary
attachment (major MIME-type of application, image, audio, video, or
model) has sent the recipient mail before. If this is the first email
the recipient has ever received from this sender, and it contains such
an attachment, it gets penalized accordingly for coming from a stranger.

You need to have the MIMEHeader plugin installed, but this is included
by default in the newer SpamAssassin 3.2.x series. The ruleset can be
added easily to your local.cf file:
ifplugin Mail::SpamAssassin::Plugin::MIMEHeader

mimeheader __L_C_TYPE_APP    Content-Type =~ /^application/i
mimeheader __L_C_TYPE_IMAGE  Content-Type =~ /^image/i
mimeheader __L_C_TYPE_AUDIO  Content-Type =~ /^audio/i
mimeheader __L_C_TYPE_VIDEO  Content-Type =~ /^video/i
mimeheader __L_C_TYPE_MODEL  Content-Type =~ /^model/i

meta     L_STRANGER_APP    (!AWL && __L_C_TYPE_APP)
score    L_STRANGER_APP    1.0
tflags   L_STRANGER_APP    noautolearn
priority L_STRANGER_APP    1001 # defer till after AWL
describe L_STRANGER_APP    Application file sent by a stranger

meta     L_STRANGER_IMAGE  (!AWL && __L_C_TYPE_IMAGE)
score    L_STRANGER_IMAGE  1.0
tflags   L_STRANGER_IMAGE  noautolearn
priority L_STRANGER_IMAGE  1001 # defer till after AWL
describe L_STRANGER_IMAGE  Image file sent by a stranger

meta     L_STRANGER_AUDIO  (!AWL && __L_C_TYPE_AUDIO)
score    L_STRANGER_AUDIO  1.0
tflags   L_STRANGER_AUDIO  noautolearn
priority L_STRANGER_AUDIO  1001 # defer till after AWL
describe L_STRANGER_AUDIO  Audio file sent by a stranger

meta     L_STRANGER_VIDEO  (!AWL && __L_C_TYPE_VIDEO)
score    L_STRANGER_VIDEO  1.0
tflags   L_STRANGER_VIDEO  noautolearn
priority L_STRANGER_VIDEO  1001 # defer till after AWL
describe L_STRANGER_VIDEO  Video file sent by a stranger

meta     L_STRANGER_MODEL  (!AWL && __L_C_TYPE_MODEL)
score    L_STRANGER_MODEL  1.0
tflags   L_STRANGER_MODEL  noautolearn
priority L_STRANGER_MODEL  1001 # defer till after AWL
describe L_STRANGER_MODEL  Model file sent by a stranger

endif
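
For anyone who wants to reason about what the ruleset above does
without loading it into SpamAssassin, here is a small Python sketch of
the same decision logic. It is only a model: the function names are
invented for the example, and the sender_seen_before flag stands in for
"the AWL rule fired" rather than performing a real AWL lookup.

```python
from email import message_from_string

# Major MIME types treated as "binary" by the ruleset above.
BINARY_MAJOR_TYPES = {"application", "image", "audio", "video", "model"}


def binary_major_types(raw_message):
    """Return the set of binary major MIME types found in any MIME part."""
    msg = message_from_string(raw_message)
    found = set()
    for part in msg.walk():
        major = part.get_content_maintype()
        if major in BINARY_MAJOR_TYPES:
            found.add(major)
    return found


def stranger_penalty(raw_message, sender_seen_before, score_per_type=1.0):
    """Add score_per_type for each binary major type sent by a stranger.

    sender_seen_before is an assumption standing in for the AWL check;
    each major type maps to one L_STRANGER_* rule scored at 1.0.
    """
    if sender_seen_before:
        return 0.0
    return score_per_type * len(binary_major_types(raw_message))
```

Because each major type has its own rule, a first-time sender who
attaches both an image and a PDF would pick up 2.0 points in this model,
while a known correspondent sending the same message picks up none.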
(6) SaneSecurity signatures
If you use ClamAV (you do, don't you?), another option is to use the
phishing and scam signatures published by SaneSecurity
<http://www.sanesecurity.co.uk/clamav/>. These signatures are updated
multiple times a day, and include a lot of PDF spam, making it perhaps
the most responsive solution available at the moment.

These phishing/scam emails get caught by ClamAV rather than
SpamAssassin, so they show up in Maia's "Viruses/Malware" quarantine
instead of the spam quarantine, which is a bit annoying, but that's
something I'll be working to address in future versions.

I can't argue with the effectiveness of SaneSecurity's signatures,
though--they are by far the most effective blockers of PDF spam that
I've found, and I would strongly recommend that you use them.
(7) Other plugins
While rules and plugins that target PDF spam specifically are very
useful, it's worth noting that the bulk of the PDF spam comes from
botnets, so adding the Botnet plugin
<http://people.ucsc.edu/~jrudd/spamassassin/> can catch a lot of these
things on its own, and it provides a nice score supplement to go along
with the PDF-specific rules. The latest version is 0.8, and it just
needs one small patch (courtesy of Mark Martinec):
--- Botnet.pm.orig Mon Aug 6 15:59:16 2007
+++ Botnet.pm Mon Aug 6 16:02:43 2007
@@ -711,5 +711,14 @@
 (defined $max) &&
 ($max =~ /^-?\d+$/) ) {
- $resolver = Net::DNS::Resolver->new();
+ $resolver = Net::DNS::Resolver->new(
+ udp_timeout => 5,
+ tcp_timeout => 5,
+ retrans => 0,
+ retry => 1,
+ persistent_tcp => 0,
+ persistent_udp => 0,
+ dnsrch => 0,
+ defnames => 0,
+ );
 if ($query = $resolver->search($name, $type)) {
 # found matches
@@ -834,5 +843,14 @@
 my ($ip) = @_;
 my ($query, @answer, $rr);
- my $resolver = Net::DNS::Resolver->new();
+ my $resolver = Net::DNS::Resolver->new(
+ udp_timeout => 5,
+ tcp_timeout => 5,
+ retrans => 0,
+ retry => 1,
+ persistent_tcp => 0,
+ persistent_udp => 0,
+ dnsrch => 0,
+ defnames => 0,
+ );
 my $name = "";
- --
Robert LeBlanc <rjl renaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
iD8DBQFGwM//GmqOER2NHewRAhqDAKCRY5U7T4hgl3yj928ajM8KuceI2wCfYESS
25zC3NMEDVmcUaEJw9En4A8=
=zjNR
-----END PGP SIGNATURE-----
_______________________________________________
Maia-users mailing list
Maia-users renaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
|
|
| Re: PDF spam solutions |
  United Kingdom |
2007-08-14 18:23:14 |
Hi Robert
From the patch available at the following URL:
http://theinternet.org.uk/downloads/spamshifter-maia102.txt

+# Imported from Amavisd-new 2.5 the ability to make it possible for a virus
+# scanner to derate an infection report to a spam report with some highly
+# ragged bits (if's and dodgy vars) that need working on!!

I have tested it and it works on our devel box. However, my poor excuse
for Perl programming and numerous interruptions mean that it will not be
up to your standard, so you will want to check it over more than once,
especially the bit marked

# - MH Domain for recipient can't remember why this is here!

though you might not need it.
If I get time, I will clean it up and redo the lazy parts.
Cheers
Mike
Robert LeBlanc wrote:
> Like the rest of you, I'm sure, I've been receiving a glut of PDF spam
> lately, and I've been experimenting with various tactics for curbing the
> onslaught. Some tactics work better than others, naturally, so I
> thought I'd share my results here.
>
>
Snip
>
> (6) SaneSecurity signatures
>
> If you use ClamAV (you do, don't you?), another option is to use the
> phishing and scam signatures published by SaneSecurity
> <http://www.sanesecurity.co.uk/clamav/>. These signatures are updated
> multiple times a day, and include a lot of PDF spam, making it perhaps
> the most responsive solution available at the moment.
>
> These phishing/scam emails get caught by ClamAV rather than
> SpamAssassin, so they show up in Maia's "Viruses/Malware" quarantine
> instead of the spam quarantine, which is a bit annoying, but that's
> something I'll be working to address in future versions.
>
> I can't argue with the effectiveness of SaneSecurity's signatures,
> though--they are by far the most effective blockers of PDF spam that
> I've found, and I would strongly recommend that you use them.
>
>
> (7) Other plugins
>
> While rules and plugins that target PDF spam specifically are very
> useful, it's worth noting that the bulk of the PDF spam comes from
> botnets, so adding the Botnet plugin
> <http://people.ucsc.edu/~jrudd/spamassassin/> can catch a lot of these
> things on its own, and it provides a nice score supplement to go along
> with the PDF-specific rules. The latest version is 0.8, and it just
> needs one small patch (courtesy of Mark Martinec):
>
> --- Botnet.pm.orig Mon Aug 6 15:59:16 2007
> +++ Botnet.pm Mon Aug 6 16:02:43 2007
> @@ -711,5 +711,14 @@
>  (defined $max) &&
>  ($max =~ /^-?\d+$/) ) {
> - $resolver = Net::DNS::Resolver->new();
> + $resolver = Net::DNS::Resolver->new(
> + udp_timeout => 5,
> + tcp_timeout => 5,
> + retrans => 0,
> + retry => 1,
> + persistent_tcp => 0,
> + persistent_udp => 0,
> + dnsrch => 0,
> + defnames => 0,
> + );
>  if ($query = $resolver->search($name, $type)) {
>  # found matches
> @@ -834,5 +843,14 @@
>  my ($ip) = @_;
>  my ($query, @answer, $rr);
> - my $resolver = Net::DNS::Resolver->new();
> + my $resolver = Net::DNS::Resolver->new(
> + udp_timeout => 5,
> + tcp_timeout => 5,
> + retrans => 0,
> + retry => 1,
> + persistent_tcp => 0,
> + persistent_udp => 0,
> + dnsrch => 0,
> + defnames => 0,
> + );
>  my $name = "";
>
>
|
|
| Re: PDF spam solutions |
  Germany |
2007-08-18 03:42:42 |
Robert LeBlanc wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Like the rest of you, I'm sure, I've been receiving a glut of PDF spam
> lately, and I've been experimenting with various tactics for curbing the
> onslaught. Some tactics work better than others, naturally, so I
> thought I'd share my results here.
>
>
Robert,
I couldn't help noticing that the iXhash plugin I wrote and the
corresponding lists the plugin can use work quite well on this type of
spam. I guess other content-hashing mechanisms (Pyzor, Razor, etc.) will
perform similarly.

You might want to give it a try. See http://ixhash.sf.net, and sorry for
the shameless advertising.
Dirk
|
|
| Re: PDF spam solutions |
  Canada |
2007-08-20 04:10:19 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Dirk Bonengel wrote:
> I couldn't help noticing that the iXhash plugin I wrote and the
> corresponding lists the plugin can use work quite well on this type of
> spam. I guess other content-hashing mechanisms (Pyzor, Razor, etc.) will
> perform similarly.
>
> You might want to give it a try. See http://ixhash.sf.net, and sorry for
> the shameless advertising.
Thanks Dirk! Yes, I've been using your iXhash plugin for a few weeks
now, and it's definitely triggering (703 times/day on average). I
didn't mention it in my list of PDF spam solutions for the same reason
that I didn't mention Razor, Pyzor, or DCC--these are not PDF-specific
solutions, they are general-purpose spam solutions.
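
To make the "general-purpose" point concrete: a content-hashing check
reduces a message to a digest of its normalized content and looks that
digest up in a shared list, so the kind of attachment doesn't matter.
The Python sketch below is a generic illustration only; it is not the
algorithm used by iXhash, Razor, Pyzor, or DCC, and the normalization
steps and the known_spam_digests set are assumptions made for the
example.

```python
import hashlib
import re


def normalized_digest(body):
    """Hash a message body after removing whitespace and case differences."""
    collapsed = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.md5(collapsed.encode("utf-8")).hexdigest()


def is_known_spam(body, known_spam_digests):
    """Check the body's digest against a set of previously reported digests."""
    return normalized_digest(body) in known_spam_digests
```

In this model a reporting mechanism would simply add
normalized_digest(body) to the shared set, while a read-only client only
ever calls is_known_spam().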
On that note, though, I have a question for you. Will there be a spam
reporting mechanism for iXhash, as there is for Razor/Pyzor/DCC? At
present it appears to be a read-only database, but if you do eventually
introduce a reporting mechanism I'd like to provide support for it in
the process-quarantine.pl script.
- --
Robert LeBlanc <rjl renaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
iD8DBQFGyVp7GmqOER2NHewRAu1DAJ97SyHmBQbm8Zxn+ySSgLLxhEGWyACdEzvD
g5HzasFXWbSig4AaHbML+NM=
=DjAO
-----END PGP SIGNATURE-----