Thread: speeding up razor reporting




speeding up razor reporting
user name
2006-08-28 19:39:03
Hi,

I started using Razor2 late last week and it's helped my spam detection;
however, enabling reporting for it has KILLED my quarantine processing
time.  I had to disable reporting in order to make things work -- one of
my goals is to keep the learn/report process running at least once per
hour, and I stopped this morning's run after six hours.  Without
reporting, it will probably run in 45 minutes.  I run it so often
because I've got a dedicated database server which handles the
learn/report run so it doesn't slow down the relays, and having it run
often gives my users the benefit of rapid learning.

Is there anything to be done to speed up Razor2?  Alternatively, it would
be a nice feature if the learn/report cycles could be broken up.  Say,
allow the --learn to run but don't purge the database until the --report
has been run; then I could run the --report at, say, 10:00 PM and still
do the learning often.

Just a thought.

- Aaron

-- 
Aaron Bennett
Sr. Unix Systems Administrator
Clark University ITS
abennettclarku.edu     |     508.781.7315

_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users
speeding up razor reporting
user name
2006-08-29 08:44:38
Aaron Bennett wrote:

> I started using Razor2 late last week and it's helped my spam detection;
> however, enabling reporting for it has KILLED my quarantine processing
> time.  I had to disable reporting in order to make things work -- one of
> my goals is to keep the learn/report process running at least once per
> hour, and I stopped this morning's run after six hours.  Without
> reporting, it will probably run in 45 minutes.  I run it so often
> because I've got a dedicated database server which handles the
> learn/report run so it doesn't slow down the relays, and having it run
> often gives my users the benefit of rapid learning.
>
> Is there anything to be done to speed up Razor2?  Alternatively, it would
> be a nice feature if the learn/report cycles could be broken up.  Say,
> allow the --learn to run but don't purge the database until the --report
> has been run; then I could run the --report at, say, 10:00 PM and still
> do the learning often.

To a point, reporting will always be at the mercy of network latency and
server availability on the part of the report servers.  That said, you
/do/ have some control over which services you report to.  Assuming
you've applied the patches from Ticket #288
<https://secure.renaissoft.com/maia/ticket/288>, you should be able to
selectively enable/disable reporting to services as you see fit, either
from the command-line or in your maia.conf file.

Without these patches, SpamAssassin will try to report to any services
you have installed and use for lookups.  What this means is that if you
have the Razor, Pyzor, DCC, and SpamCop plugins installed and you use
them to score your mail, SpamAssassin will try to send its reports to
all of them, no matter what you've tried to tell it in the maia.conf
file.  The patches are needed to make SpamAssassin handle reporting as
advertised.  The SpamAssassin devs finally accepted my patches last
week, so hopefully this should be resolved in SpamAssassin 3.1.5.

Presuming you've applied the patches, then, I'd advise you to disable
reporting to Pyzor, since that's the one that's least reliable and
suffers from the most server downtime.  Not having to wait an extra 10
seconds for Pyzor servers for every report makes a vast improvement.

The other thing you can do, of course, is reduce the timeout values (in
seconds) in your local.cf file for the services you're reporting to, e.g.

  dcc_timeout     10
  pyzor_timeout   5
  razor_timeout   10
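
To get a rough feel for how much the timeouts matter, here is a
back-of-the-envelope sketch in Python.  It assumes (this is not stated in
the thread) that the reporting services are contacted one after another,
so their timeouts add up, and it uses a made-up figure of 1000
quarantined items; the function name is hypothetical.  It gives an upper
bound only, for the case where every service times out on every item.

```python
def worst_case_report_seconds(items, timeouts):
    """Upper bound on waiting time if every service times out on every item.

    Assumption: services are tried sequentially, so timeouts are summed.
    """
    return items * sum(timeouts.values())


# Timeout values (seconds) taken from the local.cf example above.
timeouts = {"dcc": 10, "pyzor": 5, "razor": 10}

# Hypothetical number of items in one process-quarantine run.
items = 1000

all_services = worst_case_report_seconds(items, timeouts)

# Same estimate with Pyzor reporting disabled.
without_pyzor = worst_case_report_seconds(
    items, {name: t for name, t in timeouts.items() if name != "pyzor"}
)

print(f"all services:  {all_services} s worst case")
print(f"without Pyzor: {without_pyzor} s worst case")
```

Real runs will normally finish far sooner than this bound, since most
reports do not hit the timeout; the point is only that each slow service
adds its timeout to every item it fails on.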

As for separating learning and reporting tasks, it's certainly
conceivable as a future feature, but bear in mind that for the same
reasons that you want to do your Bayes training frequently, the
reporting services would prefer you to do your reporting frequently as
well, since a large part of their benefit comes from having received a
large number of spam reports about an item before you receive your copy
of it.  If you delay your reporting until late in the evening, millions
of people will have already received copies of the spam without the
benefit of your report to help classify it.  That's why reporting and
Bayes training really go hand in hand--just as the local learning helps
improve the effectiveness of your Bayes database, the reporting helps
improve the effectiveness of Razor/Pyzor/DCC/SpamCop.

Don't get overly concerned about the fact that a process-quarantine run
takes hours to run, either; by all means schedule it to run hourly, it
won't start a new run until the current run completes, and the current
run periodically checks for new items to process as it goes, so you're
not really "falling behind" as much as you might think.  At very busy
sites in fact, the process-quarantine run is effectively a continuous,
24/7 process--a run that never ends, constantly learning and reporting
items as it manages its queue in a slow but steady stream.  During hours
with lighter traffic it gets to "catch up" a bit, while it "falls
behind" during peak traffic hours.  As long as it manages to get through
a full day's items within 24 hours or less, it all balances out.  It's
only a problem if it starts taking consistently longer than a day to
process a day's worth of items.
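
The balancing argument above can be illustrated with a small Python
sketch.  The traffic figures, the hourly capacity and the function name
are all invented for illustration and are not measurements from any Maia
site; the model simply adds each hour's arrivals to a queue and removes
up to a fixed number of processed items.

```python
def simulate_backlog(arrivals_per_hour, capacity_per_hour, days=1):
    """Return the queue length at the end of each hour of a continuous run."""
    backlog = 0
    history = []
    for _ in range(days):
        for arrived in arrivals_per_hour:
            backlog += arrived
            # Process as many items as capacity allows, never more than queued.
            backlog -= min(backlog, capacity_per_hour)
            history.append(backlog)
    return history


# Hypothetical day: 8 quiet hours, 8 peak hours, 8 quiet hours (2400 items).
arrivals = [50] * 8 + [200] * 8 + [50] * 8

# 110 items/hour is 2640 items/day, more than a day's arrivals.
keeping_up = simulate_backlog(arrivals, capacity_per_hour=110, days=3)

# 90 items/hour is 2160 items/day, less than a day's arrivals.
falling_behind = simulate_backlog(arrivals, capacity_per_hour=90, days=3)

# Backlog at the end of each of the three days.
print("keeping up:    ", keeping_up[23::24])
print("falling behind:", falling_behind[23::24])
```

In the first case the queue swells during the peak hours and shrinks
again overnight, ending each day at the same level; in the second it ends
every day longer than the day before, which is the situation described
above as a real problem.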

-- 
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>

speeding up razor reporting
user name
2006-08-29 09:50:56
Robert LeBlanc wrote:
> /do/ have some control over which services you report to.  Assuming
> you've applied the patches from Ticket #288
> <https://secure.renaissoft.com/maia/ticket/288>, you should be able to
> selectively enable/disable reporting to services as you see fit, either
> from the command-line or in your maia.conf file.
>
Could these patches also be applied to newer versions of SpamAssassin, or
do they only apply to 3.1.1?

Thx
speeding up razor reporting
user name
2006-08-29 10:01:52
Davide Bozzelli wrote:
> Robert LeBlanc wrote:
>
>> /do/ have some control over which services you report to.  Assuming
>> you've applied the patches from Ticket #288
>> <https://secure.renaissoft.com/maia/ticket/288>, you should be able to
>> selectively enable/disable reporting to services as you see fit, either
>> from the command-line or in your maia.conf file.
>
> Could these patches also be applied to newer versions of SpamAssassin, or
> do they only apply to 3.1.1?

The patches for the Pyzor and SpamCop plugins should work fine with
SpamAssassin 3.1.4.  The patch for the DCC plugin has changed a bit, but
the basics are the same ("$self->->" still becomes
"$options->->->{dont_report_to_dcc}").  Just search for
"dont_report_to_dcc" to find the line you need to change.

-- 
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>

speeding up razor reporting
user name
2006-08-29 13:14:41
Robert LeBlanc wrote:
> At very busy sites in fact, the process-quarantine run is effectively a
> continuous, 24/7 process--a run that never ends, constantly learning and
> reporting items as it manages its queue in a slow but steady stream.
> During hours with lighter traffic it gets to "catch up" a bit, while it
> "falls behind" during peak traffic hours.  As long as it manages to get
> through a full day's items within 24 hours or less, it all balances out.
> It's only a problem if it starts taking consistently longer than a day to
> process a day's worth of items.
>
Ahhh.  That's brilliant.  I didn't realize that at all.  Fair enough,
I'll leave reporting enabled and not worry about it.  Excellent news.

-- 
Aaron Bennett
Sr. Unix Systems Administrator
Clark University ITS
abennettclarku.edu     |     508.781.7315
