List Info

Thread: Reporting takes forever...




Reporting takes forever...
user name
2006-07-16 08:39:45
Hello,

I was looking at our mail system, and discover it takes it
forever to report+learn mails... I know it's not a big
system,
but it works otherwise pretty good (bi p3-750, 1G ram)

amavisnw2 ~ $ time /usr/bin/process-quarantine.pl --learn
--report --debug info --limit=300
2006-07-16 10:27:14 Maia: [process-quarantine] Removing
stale lock file [22249]...
2006-07-16 10:27:16 Maia: [process-quarantine-sub] Warning:
enforcing maximum limit of 127 spam/non-spam items per run.
2006-07-16 10:27:16 Maia: [process-quarantine-sub] Starting
2006-07-16 10:27:16 Maia: [process-quarantine-sub] limit =
127 items per run
2006-07-16 10:27:16 Maia: [process-quarantine-sub] max-size
= 262144 bytes
2006-07-16 10:27:16 Maia: [process-quarantine-sub] learn =
enabled
2006-07-16 10:27:16 Maia: [process-quarantine-sub] report =
enabled
2006-07-16 10:27:16 Maia: [process-quarantine-sub] report to
Razor = enabled
2006-07-16 10:27:16 Maia: [process-quarantine-sub] report to
Pyzor = enabled
2006-07-16 10:27:16 Maia: [process-quarantine-sub] report to
DCC = enabled
2006-07-16 10:27:16 Maia: [process-quarantine-sub] report to
SpamCop = enabled
2006-07-16 10:27:16 Maia: [process-quarantine-sub] spam-only
= disabled
2006-07-16 10:27:16 Maia: [process-quarantine-sub] ham-only
= disabled
2006-07-16 10:27:16 Maia: [process-quarantine-sub] debug =
enabled (info)
2006-07-16 10:27:26 Maia: [process-quarantine-sub] Mail item
is encrypted.

First, I don't quite get the 127 limit per run, but it
doesn't
seem to impact on performance anyway. It just takes forever
to report a single mail... have a look at this:

2006-07-16 10:29:44 Maia: [process-quarantine-sub] Mail item
is encrypted.
2006-07-16 10:34:49 Maia: [process-quarantine-sub] Learned
mail item 390960 as spam and reported it
[22296] info: reporter: spam reported to DCC
[22296] info: reporter: spam reported to Pyzor
[22296] info: reporter: spam reported to Razor
[22296] info: reporter: spam reported to SpamCop
2006-07-16 10:35:45 Maia: [process-quarantine-sub] Learned
mail item 390960 as spam and reported it
2006-07-16 10:35:45 Maia: [process-quarantine-sub] Deleted
spam/non-spam recipient references to mail item 390960
2006-07-16 10:35:45 Maia: [process-quarantine-sub] Deleted
mail item 390960
2006-07-16 10:35:45 Maia: [process-quarantine-sub] TIMING
[total 360429 ms] - decrypt-msg: 267 (0%), msg-prep: 8 (0%),
send-report: 360136 (100%), delete-mail: 19 (0%), rundown: 0
(0%)

6 minutes !! And system load up to >2...

I don't see where the culprit lies -- except that it's
learning
that seems to take forever. Spamassassin over MySQL/InnoDB.
Any
idea, hint on how to ameliorate this ?

	Arnaud.
-- 
Perso: http://launay.org/blog/
Hébergement: http://www.nocworld.com/
_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
Reporting takes forever...
user name
2006-07-16 10:47:33
Arnaud Launay wrote:

> I was looking at our mail system, and discover it takes
it
> forever to report+learn mails... I know it's not a big
system,
> but it works otherwise pretty good (bi p3-750, 1G ram)
> 
> amavisnw2 ~ $ time /usr/bin/process-quarantine.pl --learn
--report --debug info --limit=300
> 2006-07-16 10:27:14 Maia: [process-quarantine] Removing
stale lock file [22249]...
> 2006-07-16 10:27:16 Maia: [process-quarantine-sub]
Warning: enforcing maximum limit of 127 spam/non-spam items
per run.
> 2006-07-16 10:27:16 Maia: [process-quarantine-sub]
limit = 127 items per run
> 
> First, I don't quite get the 127 limit per run, but it
doesn't
> seem to impact on performance anyway. 

The maximum number of items of each type (spam and non-spam)
that
process-quarantine can handle in one pass is 127, so if you
try to set
the limit higher than this you'll see that warning and the
upper limit
of 127 will be used instead.  The reason for the 127 limit
is that the
return code for the process-quarantine-sub worker script is
limited to a
single byte value (0-255), and the return code is used by
the master
script to determine how many items were actually processed
during that
run by the worker script.  With 0-127 spam items plus 0-127
non-spam
items, the result ends up in the 0-254 range, with code 255
reserved as
an error result.

In general, though, you should be using fairly small run
sizes anyway,
as a guard against memory allocation issues with Perl's
garbage
collector.  Just as it's wise to have amavisd-maia children
commit
suicide and respawn after 10 or so mail items have been
processed, you
should have process-quarantine do the same, otherwise memory
usage tends
to creep upward.  The default run size is 5 items, for
instance, but
remember that the master script will keep running the worker
script as
many times as it needs to in order to get through the items
that need to
br processed.  So if you use a small limit value, you'll do
more runs,
but those runs will end up consuming less memory than you
would if you
processed more items during each run.


> 2006-07-16 10:35:45 Maia: [process-quarantine-sub]
TIMING [total 360429 ms] - decrypt-msg: 267 (0%), msg-prep:
8 (0%), send-report: 360136 (100%), delete-mail: 19 (0%),
rundown: 0 (0%)
> 
> 6 minutes !! And system load up to >2...
> 
> I don't see where the culprit lies -- except that
it's learning
> that seems to take forever. Spamassassin over
MySQL/InnoDB. Any
> idea, hint on how to ameliorate this ?

It's not so much the learning that's taking the time, but
the timeouts
that you're experiencing during the reporting phase.  Try
setting the
debug level to "all" instead of
"info", and you can see which of the
reporting servers you're having trouble with.  More than
likely it's the
Pyzor servers, which are notorious for going down for weeks
at a time,
and being hard to reach the rest of the time.  If you
discover that one
of these services is to blame, you can speed things up by
not reporting
to that particular service, either by editing
$report_options in your
maia.conf file:

# Reporting options (add values together as desired):
#    0 = none (don't report spam)
#    1 = report to Razor
#    2 = report to Pyzor
#    4 = report to DCC
#    8 = report to SpamCop
#$report_options = 0;
$report_options = 1 + 4 + 8;

or by using the "--no-pyzor" flag when you call
process-quarantine.pl.

Note that for this to work, you need to apply a small patch
to
SpamAssassin, as explained in
<htt
ps://secure.renaissoft.com/maia/ticket/288>, since
the SpamAssassin
devs still haven't fixed this trivial bug (as of SA 3.1.3).

-- 
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamail
guard.com/>

_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
Reporting takes forever...
user name
2006-07-16 10:57:45
Le Sun, Jul 16, 2006 at 03:47:33AM -0700, Robert LeBlanc a
écrit:
> It's not so much the learning that's taking the time,
but the timeouts
> that you're experiencing during the reporting phase. 
Try setting the

Seems not...

Spent a long time here:

[12961] dbg: textcat: X-Languages: "zh.big5",
X-Languages-Length: 10000
[LOOOOOOOOOOOOOOOONG TIME]
[12961] dbg: bayes: seen
(c47eaa6577507589ffe1a35bd944cbd7ead7133fsa_generated) put
[12961] dbg: bayes: learned
'c47eaa6577507589ffe1a35bd944cbd7ead7133fsa_generated', atime: 1153047049
[12961] dbg: markup: removing markup

So it does really seem it's learning that's kinda the time
eater...

All the rest went in a few seconds.

	Arnaud.
-- 
Perso: http://launay.org/blog/
Hébergement: http://www.nocworld.com/
_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
Reporting takes forever...
user name
2006-07-16 21:26:53
Arnaud Launay wrote:

> Spent a long time here:
> 
> [12961] dbg: textcat: X-Languages:
"zh.big5", X-Languages-Length: 10000
> [LOOOOOOOOOOOOOOOONG TIME]
> [12961] dbg: bayes: seen
(c47eaa6577507589ffe1a35bd944cbd7ead7133fsa_generated) put
> [12961] dbg: bayes: learned
'c47eaa6577507589ffe1a35bd944cbd7ead7133fsa_generated', atime: 1153047049
> [12961] dbg: markup: removing markup
> 
> So it does really seem it's learning that's kinda the
time eater...

All right then, it's time to look at your Bayes database. 
Are you
storing your Bayes database in an SQL database, or in the
older-style
Berkeley DB format?  Unless your traffic levels are quite
small, I would
definitely suggest using an SQL database, since at higher
volume the
Berkeley DB tends to create a lot of lockfile-related
delays.  If you're
using a MySQL database for your Bayes, make sure you're
using InnoDB
tables (which offer row-level locking), and not MyISAM
tables (which
offer only table-level locking).

It's possible in either case that you've got a large
number of tokens
that haven't been properly expired.  You can try
"sa-learn --dump magic"
to get a look at some Bayes-related statistics, and
"sa-learn
--force-expire" (done as your amavis/maia user, of
course) to force an
expiry run to take place.

You may also be able to speed things up by enabling
auto-learning in
your SpamAssassin local.cf file.  In that case, the Bayes
training for
given item takes place at the time the email is scanned, so
you get a
small additional delay at scan-time instead of one big delay
when the
process-quarantine script is run.  The process-quarantine
script will
only try to train the Bayes database with items that have
not already
been learned, or whose learning status has changed (e.g.
false
positives/negatives), so this should reduce the number of
items that
process-quarantine has to deal with.  The tradeoff, of
course, is that
every mail item's scan time increases somewhat.

Finally, if you're using an SQL database for your Bayes,
investigate
whether the delays might be related to database I/O
problems.  Is the
database server too busy for the hardware (would more RAM or
processor
power, or moving the database to a dedicated host help)?  Is
there a
large queue of queries and statements waiting to execute? 
Are there
indexes (bayes_token_idx1, bayes_token_idx2) on the
bayes_token table?

-- 
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamail
guard.com/>

_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
Reporting takes forever...
user name
2006-07-16 22:05:31
Le Sun, Jul 16, 2006 at 02:26:53PM -0700, Robert LeBlanc a
écrit:
> All right then, it's time to look at your Bayes
database.  Are you

As said early, MySQL/InnoDB.

> It's possible in either case that you've got a large
number of tokens
> that haven't been properly expired.  You can try
"sa-learn --dump magic"
> to get a look at some Bayes-related statistics, and
"sa-learn
> --force-expire" (done as your amavis/maia user,
of course) to force an
> expiry run to take place.

force-expire reports this:
[17775] dbg: bayes: expiry check keep size, 0.75 * max:
112500
[17775] dbg: bayes: token count: 250474, final goal
reduction size: 137974
[17775] dbg: bayes: first pass? current: 1153086364, Last:
1153048257, atime: 44796, count: 187026, newdelta: 60721,
ratio: 1.35551625668604, period: 43200
[17775] dbg: bayes: can do estimation method for expiry,
skipping first pass
expired old bayes database entries in 82 seconds
244402 entries kept, 6072 deleted
token frequency: 1-occurrence tokens: 16.05%
token frequency: less than 8 occurrences: 97.37%
[17775] dbg: bayes: expiry completed


So it seems to me it's run at regular interval (twice a day
?
strange). Anyway, that one ran quickly...

> You may also be able to speed things up by enabling
auto-learning in
> your SpamAssassin local.cf file.  In that case, the
Bayes training for

Is it this:
#   Use Bayesian classifier (default: 1)
# use_bayes 1

If so, it's enabled by default...

I'm logging the sql queries, it seems that the culprit is
in
LOOOOT of those requests:
                     18 Query       SELECT spam_count,
ham_count, atime
               FROM bayes_token
              WHERE id = '1'
                AND token = '„ìíÞ^'
                     18 Query       INSERT INTO bayes_token
                   (id, token, spam_count, ham_count, atime)
                   VALUES
('1','„ìíÞ^','1','0','1153087157')


> process-quarantine has to deal with.  The tradeoff, of
course, is that
> every mail item's scan time increases somewhat.

I sincerely hope right now that autolearn isn't really
enabled...

> database server too busy for the hardware (would more
RAM or processor
> power, or moving the database to a dedicated host
help)?  Is there a
> large queue of queries and statements waiting to
execute?  Are there
> indexes (bayes_token_idx1, bayes_token_idx2) on the
bayes_token table?

It certainly spends time crunching on the DB, but the
indexes are
there... The thing that could help of course is moving it to
a
better HD/proc config, but I don't feel like giving up so
quick
without trying things before  I doubt I
extracted the most this
server can do.

	Arnaud.
-- 
Perso: http://launay.org/blog/
Hébergement: http://www.nocworld.com/
_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
Reporting takes forever...
user name
2006-07-16 23:00:41
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Arnaud Launay wrote:

> It certainly spends time crunching on the DB, but the
indexes are
> there... The thing that could help of course is moving
it to a
> better HD/proc config, but I don't feel like giving up
so quick
> without trying things before  I doubt I
extracted the most this
> server can do.

I've had slow performance with bayes lately... I think it
got worse in one of
the last 1 or 2 SA upgrades.

And yet, it's the query that is taking so long.  Is there
any problem with the
fact that it is a char(5) column but the data appears to be
binary?

- --
David Morton
Maia Mailguard                        - http://www.maiamailguard
.com
Morton Software Design and Consulting - http://www.dgrmm.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org


iD8DBQFEusUZUy30ODPkzl0RAvY9AKDULWcfr6hkZM34h53XPZzTf78IKwCe
KmnY
c81jwqnatYxSkrSo5MohgYg=
=3NHM
-----END PGP SIGNATURE-----
_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
Reporting takes forever...
user name
2006-07-16 23:16:12
Arnaud Launay wrote:

> force-expire reports this:
> [17775] dbg: bayes: expiry check keep size, 0.75 * max:
112500
> [17775] dbg: bayes: token count: 250474, final goal
reduction size: 137974
> [17775] dbg: bayes: first pass? current: 1153086364,
Last: 1153048257, atime: 44796, count: 187026, newdelta:
60721, ratio: 1.35551625668604, period: 43200
> [17775] dbg: bayes: can do estimation method for
expiry, skipping first pass
> expired old bayes database entries in 82 seconds
> 244402 entries kept, 6072 deleted
> token frequency: 1-occurrence tokens: 16.05%
> token frequency: less than 8 occurrences: 97.37%
> [17775] dbg: bayes: expiry completed
> 
> 
> So it seems to me it's run at regular interval (twice
a day ?
> strange). Anyway, that one ran quickly...

Twice a day (i.e. every 12 hours) is the most often that you
can expire
the Bayes tokens; if you try to do an expiry within 12 hours
of the last
expiry, it will only do an "estimation pass" (as
it did for you in this
case), in which it will not expire as many tokens (because
presumably
these tokens are newer, fresher, and more relevant).  Still,
the fact
that your last expiry was so recent is a good sign, as it
implies that
the auto-expiry mechanism is engaged and working.

You can ensure explicitly that auto-expiry is enabled by
setting
"bayes_auto_expire 1" in your local.cf, in which
case the expiry is done
opportunistically at scan-time, letting SpamAssassin decide
whether any
tokens need to be expired on its own.  If you prefer to
schedule expiry
of the Bayes tokens yourself, set "bayes_auto_expire
0" and schedule a
cron job to run "sa-learn --force-expire" at
regular intervals (at least
12 hours apart; once a day is fine in most cases).  This
will speed up
scan times a bit, but at the cost of a longer expiry process
when it
does run (essentially the same tradeoff that comes with
auto-learning).


>>You may also be able to speed things up by enabling
auto-learning in
>>your SpamAssassin local.cf file.  In that case, the
Bayes training for
> 
> 
> Is it this:
> #   Use Bayesian classifier (default: 1)
> # use_bayes 1
> 
> If so, it's enabled by default...

No, it's these:

bayes_auto_learn                    1
bayes_auto_learn_threshold_nonspam  -0.001
bayes_auto_learn_threshold_spam     10.0

You should set the thresholds at conservative levels, such
that when the
mail scores below the non-spam threshold it can be
auto-learned as
non-spam, and when it scores higher than the spam threshold
it can be
auto-learned as spam.  This is particularly useful if
you've got a lot
of users who neglect their quarantines/caches, such that
you're expiring
a lot of unconfirmed mail.  Since the auto-learning process
takes place
at scan-time, it can at least catch the more
conservative-scoring items
and use them for Bayes training.


> I'm logging the sql queries, it seems that the culprit
is in
> LOOOOT of those requests:
>                      18 Query       SELECT spam_count,
ham_count, atime
>                FROM bayes_token
>               WHERE id = '1'
>                 AND token = '„ìíÞ^'
>                      18 Query       INSERT INTO
bayes_token
>                    (id, token, spam_count, ham_count,
atime)
>                    VALUES
('1','„ìíÞ^','1','0','1153087157')

In that case it looks like a database issue, or perhaps more
correctly a
disk I/O issue, if the database is write-limited by your
storage
subsystem.  Striping the database across several disks (e.g.
RAID-0, -5,
-10, etc.) would make more drive heads available for reading
and
writing, improving your performance in that regard.

-- 
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamail
guard.com/>


_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
Reporting takes forever...
user name
2006-07-16 23:50:21
David Morton wrote:

> Is there any problem with the
> fact that it is a char(5) column but the data appears
to be binary?

There may be something to that, actually.  I noticed during
the
SpamAssassin 3.0-3.1 upgrade the SpamAssassin devs changed
the column
type from "CHAR" to "BYTEA" in the
PostgreSQL schema, but no changes
were made to the MySQL schema.  However, checking MySQL's
docs regarding
the CHAR data type, this little comment caught my eye:

> Before MySQL 5.0.3, if you need a data type for which
trailing spaces are not removed, consider using a BLOB or
TEXT type. Also, if you want to store binary values such as
results from an encryption or compression function that
might contain arbitrary byte values, use a BLOB column
rather than a CHAR or VARCHAR column, to avoid potential
problems with trailing space removal that would change data
values.

For users of MySQL 5+, of course, the new
"BINARY" type would probably
be the proper solution, replacing "CHAR(5)" with
"BINARY(5)".

The main difference between BINARY and CHAR types, though,
is simply
that with CHAR types, the binary value is mapped to a local
character
set, so that for instance a byte with decimal value 65 gets
represented
as 'A' if your character set happens to be US-ASCII or
ISO-8859-1.  As
long as the character set that MySQL is using is the same as
the one
SpamAssassin is using, the BINARY and CHAR types ought to be
functionally equivalent.

-- 
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamail
guard.com/>

_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
[1-8]

about | contact  Other archives ( Real Estate discussion Medical topics )