List Info

Thread: excessive time spent in sql-update?




excessive time spent in sql-update?
country flaguser name
United States
2007-08-17 15:48:15
   Hi folks.  I've been handholding one of my mail servers
for most  
of today as it deals with a large backlog of messages that
are  
seemingly the result of a large flood of spam.  In the
TIMING log  
entries, amavisd-new (v2.4.5) is spending what I consider to
be an  
excessive amount of time in "sql-update"...upwards
of nine seconds in  
some cases, when the entire run is taking close to twenty
seconds. (!)

   In my installation, amavisd-new is logging and
quarantining to an  
instance of MySQL running on a dedicated, reasonably quick
database  
server located on the local network.

   My forte' is C; I'm not much of a Perl hacker, otherwise
I'd dig  
in and find out exactly what's happening in that routine. 
Can  
someone shed a little light on this for me?  Specifically,
I'd like  
to understand specifically what's going on in that routine
(i.e.,  
"it's updating table ABC with information XYZ")
and I'd really like  
some suggestions on how to identify the bottleneck so I can
address it.

   Does anyone have any thoughts on this?

            Thanks,
            -Dave

-- 
Dave McGuire
Port Charlotte, FL
Farewell Ophelia, 9/22/1991 - 7/25/2007




------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-userlists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user

AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/

Re: excessive time spent in sql-update?
country flaguser name
United States
2007-08-17 16:42:55
On Aug 17, 2007, at 4:48 PM, Dave McGuire wrote:
>    Hi folks.  I've been handholding one of my mail
servers for most
> of today as it deals with a large backlog of messages
that are
> seemingly the result of a large flood of spam.  In the
TIMING log
> entries, amavisd-new (v2.4.5) is spending what I
consider to be an
> excessive amount of time in
"sql-update"...upwards of nine seconds in
> some cases, when the entire run is taking close to
twenty seconds. (!)
>
>    In my installation, amavisd-new is logging and
quarantining to an
> instance of MySQL running on a dedicated, reasonably
quick database
> server located on the local network.
>
>    My forte' is C; I'm not much of a Perl hacker,
otherwise I'd dig
> in and find out exactly what's happening in that
routine.  Can
> someone shed a little light on this for me? 
Specifically, I'd like
> to understand specifically what's going on in that
routine (i.e.,
> "it's updating table ABC with information
XYZ") and I'd really like
> some suggestions on how to identify the bottleneck so I
can address  
> it.
>
>    Does anyone have any thoughts on this?

   Please forgive me for replying to my own message here,
but I'd  
like to share some information about some significant
progress I've  
just made.  I managed to catch one of amavisd-new's SQL
transactions  
in MySQL's "show full processlist" output, and it
was executing an  
update query on the msgs table with a search field (mail_id)
that  
wasn't indexed.  I added a new index on that field thusly:

   CREATE INDEX msgs_mail_id ON msgs (mail_id);

   ...and the average (eyeballed) query times have dropped
from 8-15  
seconds to less than 100ms, in some cases less than 20ms.

   I'm currently digging for the documentation I used to
create those  
tables to see if I missed the need for that index.

   Can anyone share any other tips here?  What other indices
are  
people using to improve query performance on their
database-backed  
amavisd-new installations?

           -Dave

-- 
Dave McGuire
Port Charlotte, FL
Farewell Ophelia, 9/22/1991 - 7/25/2007




------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-userlists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user

AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/

Re: excessive time spent in sql-update?
country flaguser name
Switzerland
2007-08-17 18:01:08
-------- Original-Nachricht --------
Datum: Fri, 17 Aug 2007 17:42:55 -0400
Von: Dave McGuire <mcguireneurotica.com>
An: amavis-userlists.sourceforge.net
Betreff: Re: [AMaViS-user] excessive time spent in
sql-update?

> On Aug 17, 2007, at 4:48 PM, Dave McGuire wrote:
> >    Hi folks.  I've been handholding one of my mail
servers for most
> > of today as it deals with a large backlog of
messages that are
> > seemingly the result of a large flood of spam.  In
the TIMING log
> > entries, amavisd-new (v2.4.5) is spending what I
consider to be an
> > excessive amount of time in
"sql-update"...upwards of nine seconds in
> > some cases, when the entire run is taking close to
twenty seconds. (!)
> >
> >    In my installation, amavisd-new is logging and
quarantining to an
> > instance of MySQL running on a dedicated,
reasonably quick database
> > server located on the local network.
> >
> >    My forte' is C; I'm not much of a Perl hacker,
otherwise I'd dig
> > in and find out exactly what's happening in that
routine.  Can
> > someone shed a little light on this for me? 
Specifically, I'd like
> > to understand specifically what's going on in that
routine (i.e.,
> > "it's updating table ABC with information
XYZ") and I'd really like
> > some suggestions on how to identify the bottleneck
so I can address  
> > it.
> >
> >    Does anyone have any thoughts on this?
> 
>    Please forgive me for replying to my own message
here, but I'd  
> like to share some information about some significant
progress I've  
> just made.  I managed to catch one of amavisd-new's SQL
transactions  
> in MySQL's "show full processlist" output,
and it was executing an  
> update query on the msgs table with a search field
(mail_id) that  
> wasn't indexed.  I added a new index on that field
thusly:
> 
>    CREATE INDEX msgs_mail_id ON msgs (mail_id);
> 
On my setup mail_id is the primary key and MySQL
automatically adds a unique index for the primary key.

Adding another index for mail_id will not do anything bad
except:
- You added a redundant index
- You need more space on the database (for the redundant
index)
- Inserting data to the database will take more time
(maintaining indices is time intensive)


>    ...and the average (eyeballed) query times have
dropped from 8-15  
> seconds to less than 100ms, in some cases less than
20ms.
> 
>    I'm currently digging for the documentation I used
to create those  
> tables to see if I missed the need for that index.
> 
>    Can anyone share any other tips here?  What other
indices are  
> people using to improve query performance on their
database-backed  
> amavisd-new installations?
> 
>            -Dave
> 
> -- 
> Dave McGuire
> Port Charlotte, FL
> Farewell Ophelia, 9/22/1991 - 7/25/2007
> 
> 
> 
> 
>
------------------------------------------------------------
-------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems? 
Stop.
> Now Search log events and configuration files using
AJAX and a browser.
> Download your FREE copy of Splunk now >>  http://get.splunk.com/
> _______________________________________________
> AMaViS-user mailing list
> AMaViS-userlists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/amavis-user

> AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
> AMaViS-HowTos:http://www.amavis.org/ho
wto/

-- 
Psssst! Schon vom neuen GMX MultiMessenger gehört?
Der kanns mit allen: http://www.gm
x.net/de/go/multimessenger

------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-userlists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user

AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/

Re: excessive time spent in sql-update?
country flaguser name
United States
2007-08-17 19:12:19
On Aug 17, 2007, at 7:01 PM, steeeeeveeegmx.net
wrote:
>>    Please forgive me for replying to my own message
here, but I'd
>> like to share some information about some
significant progress I've
>> just made.  I managed to catch one of amavisd-new's
SQL transactions
>> in MySQL's "show full processlist"
output, and it was executing an
>> update query on the msgs table with a search field
(mail_id) that
>> wasn't indexed.  I added a new index on that field
thusly:
>>
>>    CREATE INDEX msgs_mail_id ON msgs (mail_id);
>>
> On my setup mail_id is the primary key and MySQL
automatically adds  
> a unique index for the primary key.
>
> Adding another index for mail_id will not do anything
bad except:
> - You added a redundant index
> - You need more space on the database (for the
redundant index)
> - Inserting data to the database will take more time
(maintaining  
> indices is time intensive)

   Yup, that makes sense to me.

   A "show index ..." didn't list an index on that
field, which  
suggests that it wasn't defined as a primary key in the
original  
CREATE TABLE statement.  I double-checked the CREATE TABLE
statements  
that I used to set up this database, and sure enough, there
was no  
primary key defined.

   The flood has died down and my mail server is now keeping
up.   
I've also tuned the MySQL server a bit (adjusted some buffer
sizes in  
my.cnf) and reduced the query latency a bit further; most
queries are  
now executing in less than 10ms.

          -Dave

-- 
Dave McGuire
Port Charlotte, FL
Farewell Ophelia, 9/22/1991 - 7/25/2007




------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-userlists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user

AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/

Re: excessive time spent in sql-update?
country flaguser name
Switzerland
2007-08-17 19:23:13
-------- Original-Nachricht --------
Datum: Fri, 17 Aug 2007 20:12:19 -0400
Von: Dave McGuire <mcguireneurotica.com>
An: amavis-userlists.sourceforge.net
Betreff: Re: [AMaViS-user] excessive time spent in
sql-update?

> On Aug 17, 2007, at 7:01 PM, steeeeeveeegmx.net
wrote:
> >>    Please forgive me for replying to my own
message here, but I'd
> >> like to share some information about some
significant progress I've
> >> just made.  I managed to catch one of
amavisd-new's SQL transactions
> >> in MySQL's "show full processlist"
output, and it was executing an
> >> update query on the msgs table with a search
field (mail_id) that
> >> wasn't indexed.  I added a new index on that
field thusly:
> >>
> >>    CREATE INDEX msgs_mail_id ON msgs
(mail_id);
> >>
> > On my setup mail_id is the primary key and MySQL
automatically adds  
> > a unique index for the primary key.
> >
> > Adding another index for mail_id will not do
anything bad except:
> > - You added a redundant index
> > - You need more space on the database (for the
redundant index)
> > - Inserting data to the database will take more
time (maintaining  
> > indices is time intensive)
> 
>    Yup, that makes sense to me.
> 
>    A "show index ..." didn't list an index on
that field, which  
> suggests that it wasn't defined as a primary key in the
original  
> CREATE TABLE statement.  I double-checked the CREATE
TABLE statements  
> that I used to set up this database, and sure enough,
there was no  
> primary key defined.
> 
Strange. You should have there a primary key. Anway... can
you check h
ttp://www.ijs.si/software/amavisd/README.sql-mysql and
fix/adjust your table to match the one described in the
document (assuming you run version 2.5.2. If not then search
for the file on your local system). If speed is an issue for
you, then don't create the foreign keys since they slow down
MySQL when purging old data from the AMaViS database.


>    The flood has died down and my mail server is now
keeping up.   
> I've also tuned the MySQL server a bit (adjusted some
buffer sizes in  
> my.cnf) and reduced the query latency a bit further;
most queries are  
> now executing in less than 10ms.
> 
>           -Dave
> 
> -- 
> Dave McGuire
> Port Charlotte, FL
> Farewell Ophelia, 9/22/1991 - 7/25/2007
> 
> 
> 
> 
>
------------------------------------------------------------
-------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems? 
Stop.
> Now Search log events and configuration files using
AJAX and a browser.
> Download your FREE copy of Splunk now >>  http://get.splunk.com/
> _______________________________________________
> AMaViS-user mailing list
> AMaViS-userlists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/amavis-user

> AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
> AMaViS-HowTos:http://www.amavis.org/ho
wto/

-- 
Psssst! Schon vom neuen GMX MultiMessenger gehört?
Der kanns mit allen: http://www.gm
x.net/de/go/multimessenger

------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-userlists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user

AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/

Re: excessive time spent in sql-update?
country flaguser name
Slovenia
2007-08-17 19:49:27
Giampaolo,

> I guess it's better you don't use the mail_id field as
a primary key: it
> may not be unique in your db. Think, in example, to
messages sent to
> destinations defined as a list or an alias in your
system.

Don't confuse mail_id with a Message-ID header.

mail_id is guaranteed by amavisd to be unique among all
records
currently present in a database.


amavisd-new-2.3.0 release notes:

- introduce a concept of 'mail_id', which is similar to the
am_id as reported
  in the log and elsewhere (e.g. 92598-11-5), but has much
stronger long-term
  uniqueness property and can be used for the purpose of
uniquely identifying
  a quarantined mail, or for other uses. The mail_id is a
12-character string
  consisting of characters [A-Za-z0-9+-] (like base64,
except for a '/' being
  substituted by a '+'), guaranteed to start and to end with
an alphanumeric
  character (i.e. not with '+' or '-'). It is derived by
cryptographically
  strong method (MD5), cumulatively collecting entropy
during the life of
  child processes, folding-in entropy from processed mail
and other cheaply
  accessible sources, collected when an opportunity arises
(e.g. file system
  file-IDs, SA results etc), without placing a burden on
system sources of
  randomness (see RFC 4086). Note that MD5 has been
demonstrated to have some
  weaknesses, but we are not talking about cryptographic
attacks here, but
  rather about spreading messages which have no inherent
intention of causing
  collisions.

  The mail_id carries 71.9 bits of information (subject to
the quality of
  sources). For a high-end system that keeps a year's worth
of mail messages
  in evidence (e.g. in quarantine) and receives 10e6
messages each day
  (20..200 TB of yearly mail contents), the probability of a
mail_id
  collision happening during one year (while gradually
displacing an entire
  collection with a new set of IDs) is n^2/m = 0.3 %
    (10e6 * 365)^2 / (62 * (64^(12-2)) * 62) = (10e6 *
365)^2 / 2^71.9 = 0.003
  Eventual clash is still possible and needs to be detected,
but a re-tried
  mail delivery attempt is acceptable if its probability is
low, as each mail
  processing rolls a dice again. On a smallish system
receiving 10.000 mail
  messages daily an 8-character mail_id would suffice, but
the savings are
  not worth the trouble of providing configuration
flexibility.

  Paired with a mail_id there is its companion secret_id
generated for each
  message, such that mail_id can be derived from secret_id
and pairing checked,
  but not the other way around. The purpose of secret_id is
not yet fully
  developed, but can serve as a 'ticket', granting user a
right to release
  a quarantined message addressed to him.


Mark

------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-userlists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user

AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/

[1-6]

about | contact  Other archives ( Real Estate discussion Medical topics )