List Info

Thread: recalc-stats taking LONG time after expire or process quarantine




recalc-stats taking LONG time after expire or process quarantine
country flaguser name
United States
2007-05-09 14:13:38
I've recently (about a week ago) replaced our amavisd-new
spam scanning
setup with Maia 1.0.2. It's working great, even under heavy
load for the
most part. Scanning is keeping up easily. All-in-all, this
is a terrific
system and will give our users a much-needed interface to
amavisd.

The only performance problem I'm seeing is that after a run
of
process-quarantine.pl or of expire-quarantine-cache.pl, it
looks like
the script is finishing up by recalculating the stats for
EVERY user on
the system. This is taking a very long time (upwards of 4
hours). Our
system has about 130K users configured and Maia has
auto-created about
60K users so far. My question is this. What would be the
adverse side
affects of just commenting out the recalc-stats sections of
code in
those scripts? Or, is there a way I can speed up the
updates? 

Here is the SQL that is executing over and over and is
sometimes a bit
slow (up to 15 secs each to run) (Runs 60K times at the end
of each of
the mentioned scripts)

# Query_time: 21  Lock_time: 0  Rows_sent: 1  Rows_examined:
1423
SELECT MIN(received_date) AS mindate, MAX(received_date) AS
maxdate,
MIN(score) AS minscore, MAX(score) AS maxscore, SUM(score)
AS
totalscore, MIN(size) AS minsize, MAX(size) AS maxsize,
SUM(size) AS
totalsize, COUNT(id) AS items FROM maia_mail,
maia_mail_recipients WHERE
maia_mail.id = maia_mail_recipients.mail_id AND
maia_mail_recipients.type = 'S' AND
maia_mail_recipients.recipient_id =
'7087';

I notice that there aren't indexes on the maia_mail.size and
score
fields. Do you think adding indexes for those fields would
help?

mysql> describe maia_mail;
+---------------+------------------+------+-----+---------+-
---------------+
| Field         | Type             | Null | Key | Default |
Extra          |
+---------------+------------------+------+-----+---------+-
---------------+
| id            | int(10) unsigned | NO   | PRI | NULL    |
auto_increment |
| received_date | datetime         | NO   | MUL |        
|                |
| size          | int(10) unsigned | NO   |     |        
|                |
| sender_email  | varchar(255)     | NO   |     |        
|                |
| envelope_to   | text             | NO   |     |        
|                |
| subject       | varchar(255)     | NO   |     |        
|                |
| contents      | longtext         | NO   |     |        
|                |
| score         | float            | YES  |     | NULL   
|                |
+---------------+------------------+------+-----+---------+-
---------------+
8 rows in set (0.13 sec)

Thanks for your time and for this great product.

-- 
  Brian C. Burkhart - Sr. Systems Engineer - OneNet
     Oklahoma State Regents for Higher Education
brianOneNet.net - Ph: 405-225-9444 - Fax: 405-225-9250

_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users

Re: recalc-stats taking LONG time after expire or process quarantine
country flaguser name
Canada
2007-05-13 23:28:01
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Brian C. Burkhart wrote:

> The only performance problem I'm seeing is that after a
run of
> process-quarantine.pl or of expire-quarantine-cache.pl,
it looks like
> the script is finishing up by recalculating the stats
for EVERY user on
> the system. This is taking a very long time (upwards of
4 hours). Our
> system has about 130K users configured and Maia has
auto-created about
> 60K users so far. My question is this. What would be
the adverse side
> affects of just commenting out the recalc-stats
sections of code in
> those scripts? Or, is there a way I can speed up the
updates? 

If stats-tracking doesn't matter to you you can certainly
bypass the
stats updating code.  Of course if you're going to do that,
you might as
well just go to the Admin->System Configuration page and
set "Enable
tracking of statistics?" to "No".

The performance issue, though, is most likely a database
tuning issue.
MySQL in particular seems to ship in a badly-tuned state for
Maia's
purposes, and there are numerous optimizations you can apply
to improve
performance dramatically.  The innodb_buffer_pool_size
setting alone can
easily give you a 2-8x performance boost.  Searching the
archives of
this list will turn up many other posts with suggested
tweaks for MySQL.


> I notice that there aren't indexes on the
maia_mail.size and score
> fields. Do you think adding indexes for those fields
would help?

Possibly; indexing is a double-edged sword, in that it makes
lookups
faster, but makes writes take longer, since the index needs
to be
updated as well.  It's worth the experiment at least, since
you can
always drop the index if it ends up making things worse.

- --
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamail
guard.com/>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGR+VRGmqOER2NHewRAkiwAJ4n+yVaQrXacRc30a3tldKx1vu24wCe
KjFs
jYFtYUE/otcvxQOLgOcpitg=
=IhEq
-----END PGP SIGNATURE-----
_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users

[1-2]

about | contact  Other archives ( Real Estate discussion Medical topics )